Chapter 10. Adaptive signal processing
10.1 Basic theory
10.1.1 Introduction

An adaptive filter is a filter that has a so-called learning function, that is, a function by which
system parameters (such as the impulse response) are successively estimated from the input signal
and the output signal (desired signal) of an unknown system. It is applied in echo cancellers,
automatic equalizers, noise cancellers, etc. [1]-[9]. Here, the portion of the adaptive filter that
corrects the filter coefficients is called the adaptive algorithm.

Issues to be addressed for an adaptive algorithm include the simultaneous realization of faster
convergence, faster execution, and smaller-scale hardware. However, as seen in research on
speeding up execution by parallel processing, these requirements conflict with one another. The
stability of operation must also be considered, and many researchers have proposed various
schemes.

First, in 1960, in their research on adaptive switching circuits, Widrow and Hoff proposed an
adaptive algorithm called the Widrow-Hoff least mean square algorithm (hereinafter referred to as
the LMS algorithm) [10]. In this algorithm, the filter coefficients are corrected on the basis of
the steepest descent method so that the mean square error is minimized. Because its computational
cost is small, it still holds its position as the typical algorithm today.

In 1967, independently of the aforementioned work, Noda and Nagumo proposed a learning (type)
identification method [11], [12]. The learning identification method has better convergence
characteristics than the LMS algorithm, although its computational cost is slightly higher.
Consequently, it is an adaptive algorithm well suited to practical application. The learning
identification method is also known as the normalized LMS algorithm: it normalizes the coefficient
correction term of the LMS algorithm by the norm of the state vector of the filter. Consequently,
the learning identification method can also be regarded as an extension of the LMS algorithm. When
it is viewed as an adaptive algorithm based on the orthogonal projection theorem, highly
interesting extensions become possible.

In any case, the LMS algorithm and the learning identification method are the typical adaptive
algorithms that have supported adaptive signal processing as it entered the stage of practical
application. These algorithms can be regarded as iterative methods that solve the Wiener-Hopf
equation, which is formed from statistical quantities of the signals, even if the statistical
properties of the given signals are unknown (or almost unknown). Also, even if the parameter to be
estimated varies relatively slowly over time, it is still possible to track the variation to a
certain degree. However, it has been pointed out that the convergence speed of these algorithms
degrades significantly when the input is a colored signal.

On the other hand, in 1960, the same year the LMS algorithm was published, Kalman proposed the
discrete-time Kalman filter [13]. Real-time processing with the algorithm proposed by Kalman is
considered difficult even now. However, it is a famous theory that dramatically extends Wiener's
idea. Also, if the state variable of the Kalman filter is taken to be the unknown parameter to be
estimated, and that parameter is assumed not to vary over time (including random fluctuation), the
Kalman filter coincides with the well-known recursive least squares (RLS) algorithm [14]. For the
RLS algorithm, assuming that the number of parameters to be estimated is N, O(N^2) multiplications
are necessary for each sample, so realization in hardware is difficult. However, when the
aforementioned assumption holds, excellent convergence characteristics are obtained. As a measure
for coping with variation in the unknown parameter, the introduction of a forgetting coefficient λ
has been proposed. However, depending on the value of λ, numerical instability may occur: the value
estimated so far may jump to another value, or the algorithm may be unable to continue, so caution
is required. When the UD decomposition method is introduced into the RLS, such phenomena hardly
occur. The RLS has also been shown to be well suited to hardware implementation by means of a
systolic array [15].

Block signal processing, which is based on blocking of the input data and the FFT, has the
excellent feature that the number of multiplications can be reduced compared with processing in
the time domain. If such block signal processing is introduced into the filter portion of the
adaptive filter, the adaptive algorithm also requires correspondingly fast processing. To meet
this requirement, G. A. Clark et al. [16] proposed the BLMS (block LMS) algorithm [17-19]. Also,
exploiting a characteristic feature of block processing, studies have been performed on how to
increase the processing speed of an adaptive filter by introducing the concept of parallel
processing to the filter computation and the adaptive algorithm. In addition, the jump algorithm
[27] has been proposed for increasing the convergence speed.

On the other hand, methods that use a plurality of input data have also been proposed, not to
increase the processing speed but to increase the convergence speed for colored input signals, in
research by [illegible]moto and Maekawa [20] and by Oseki and Umeda [21]. Their schemes are
characterized by the fact that the coefficient estimated at a certain time gives the desired
output for each of the plural state vectors used in obtaining it. However, these schemes have the
problem that the computational cost for each sample time is huge. To reduce the computational
cost, Furukawa et al. introduced the concept of block processing explained above into the
processing described in references [20], [21], and proposed the BOP (block orthogonal projection)
algorithm as its general representation [22]. A more specific algorithm is described in
references [23], [24].

In the present chapter, unless specified otherwise, a matrix A with dimensions (N x N) and a
vector B with dimensions (N x 1) are written as $A_{N,N}$ and $B_N$, respectively.

10.1.2 Problem setting and evaluation quantity

In the following, an explanation will be given regarding the problem setting and the evaluation
quantity needed for deriving the adaptive algorithm. In the following discussion, all signals are
assumed to be sampled by an appropriate scheme, and the system under study is represented in the
discrete-time domain.

Before going on to the problem setting, the definition of error will be explained. The following
three types of error can be considered.

(1) Output error: The most frequently used definition of error. As shown in Figure 10.1(a), it is
the difference z(k) - y(k), where z(k) represents the observation signal and y(k) represents the
output of the estimation system.

(2) Input error: As shown in Figure 10.1(b), this error is defined as the difference x(k) - y(k)
between the input signal x(k) and the output y(k) of the estimation system, where z(k) represents
the observation signal. In this case, the estimation system serves as an estimate of the inverse
system of the unknown system.

(3) Generalized error: As shown in Figure 10.1(c), this error is defined as a combination of the
aforementioned output error and input error. In particular, assuming that the transfer functions
of estimation system 1 and estimation system 2 are A(z) and B(z), one has

[illegible]

In this case, the generalized error e(k) is represented as follows:

[illegible]

Figure 10.1. Definitions of errors.

Key: (a) Output error
(b) Input error
(c) Generalized error
1 Unknown system
2 Estimated system
3 Estimated system 1
4 Estimated system 2

Here, to keep the explanation concise, the discussion is restricted to an unknown system that
outputs the signal d(k) defined below

$$d(k) = \sum_{i=0}^{M-1} w_i x(k-i) \qquad (10.3)$$

with respect to the input signal $\{x(k);\ -\infty < k < \infty\}$. Here, k represents the sample
No. (corresponding to time); $w_0, w_1, \ldots, w_{M-1}$ represent the impulse response to be
estimated, and M represents the length of the impulse response. In addition, x(k) is a stochastic
process. When the impulse response sequence is written in vector representation, one has

$$W_M = (w_0, w_1, \ldots, w_{M-1})^T \qquad (10.4)$$

where T denotes transposition of the vector.

On the other hand, consider another FIR digital filter having the following input/output
relationship:

$$y(k) = \sum_{i=0}^{N-1} h_i x(k-i) \qquad (10.5)$$

Here, $h_i$ represents the ith filter coefficient (impulse response), and it can be rewritten to
any value. As with $W_M$, the coefficients are written in vector representation as

$$h_N = (h_0, h_1, \ldots, h_{N-1})^T \qquad (10.6)$$

Here, suppose that a certain evaluation quantity is defined concerning the "distance" between the
aforementioned FIR digital filter and the parameter of the unknown system shown in Equation
(10.4), and that the coefficients $h_i$ of the FIR digital filter are corrected so that this
evaluation quantity is minimized. The filter is then called an adaptive filter, and the obtained
coefficients $h_i$ are called estimated values.

The problem here is how to select the evaluation quantity J. As shown in Figure 10.1(a), J can be
taken to be the mean square of the output error. In Figure 10.1(a), the adaptive filter
coefficients are updated so that the mean square of the difference between the observation signal
z(k) and the output y(k) of the estimation system is minimized. According to (a) in this figure,
the evaluation quantity J is given as follows:

$$J = E[e^2(k)] = E[\{z(k) - y(k)\}^2] \qquad (10.7)$$

In view of the objective of parameter estimation, it is important to take the "distance" between
the unknown system and the estimation system directly as the evaluation quantity. However, because
the parameter of the unknown system is unknown, such an evaluation quantity cannot be used
directly, and the mean square error of Equation (10.7) is often used as the evaluation quantity
instead.

10.1.3 Formulation of the parameter estimation problem

In the following, an explanation will be given regarding the foundation of parameter estimation
and the general properties of the solution obtained. As explained in Section 10.1.2, the problem
of parameter estimation can be formulated as the problem of minimizing an appropriately defined
evaluation quantity. As shown in Equation (10.7), the evaluation quantity J is given by

$$J = E[e^2(k)] = E[\{d(k) + v(k) - y(k)\}^2] \qquad (10.8)$$

By substituting the following input/output relationship of the adaptive filter into Equation
(10.8),

$$y(k) = h_N^T x_N(k) \qquad (10.9)$$

one obtains the following quadratic function of $h_N$ in matrix representation:

$$J = h_N^T A_{N,N}(k) h_N - 2 h_N^T v_N(k) + E[z^2(k)] \qquad (10.10)$$

Here, one has

$$A_{N,N}(k) = E[x_N(k) x_N^T(k)] \qquad (10.11)$$

$$v_N(k) = E[x_N(k) z(k)] \qquad (10.12)$$

$$z(k) = d(k) + v(k) \qquad (10.13)$$

$$d(k) = W_M^T x_M(k) \qquad (10.14)$$

$$x_N(k) = (x(k), x(k-1), \ldots, x(k-N+1))^T \qquad (10.15)$$

$$h_N = (h_0, h_1, \ldots, h_{N-1})^T \qquad (10.16)$$

$$W_M = (w_0, w_1, \ldots, w_{M-1})^T \qquad (10.17)$$

where v(k) represents the observation noise.

As can be seen from Equation (10.10), the problem of minimizing J is a so-called unconstrained
optimization problem. In addition, it is well known that $A_{N,N}(k)$ is positive definite, so
that J is the most typical convex function of $h_N$ and has a unique minimum [25]. Here, the
optimum coefficient vector $h_N$ that minimizes J at time k is written as $h_N(\mathrm{opt}, k)$.
It is obtained by partially differentiating both sides of Equation (10.10) with respect to $h_N$
and setting the result to 0. The differentiation gives

$$\frac{\partial J}{\partial h_N} = 2 A_{N,N}(k) h_N - 2 v_N(k) \qquad (10.18)$$

Consequently, one has

$$h_N(\mathrm{opt}, k) = A_{N,N}^{-1}(k) v_N(k) \qquad (10.19)$$

where $A_{N,N}(k)$ is assumed to be nonsingular. Equation (10.19) is called the Wiener-Hopf
solution.
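As a minimal sketch of Equation (10.19), the following Python fragment replaces the expectations
by time averages (which presumes ergodicity) and solves the N = 2 case in closed form. The
function name `wiener_hopf_2tap`, the test impulse response `w`, and the noise-free white input
are illustrative assumptions, not part of the original text.

```python
import random


def wiener_hopf_2tap(x, z):
    """Estimate h(opt) = A^{-1} v for N = 2 from time averages (ergodicity assumed)."""
    a00 = a01 = a11 = v0 = v1 = 0.0
    n = len(x) - 1
    for k in range(1, len(x)):
        x0, x1 = x[k], x[k - 1]              # state vector x_N(k) = (x(k), x(k-1))^T
        a00 += x0 * x0
        a01 += x0 * x1
        a11 += x1 * x1
        v0 += x0 * z[k]
        v1 += x1 * z[k]
    a00, a01, a11, v0, v1 = (s / n for s in (a00, a01, a11, v0, v1))
    det = a00 * a11 - a01 * a01              # A must be nonsingular
    # Closed-form inverse of the symmetric 2 x 2 matrix A applied to v
    return [(a11 * v0 - a01 * v1) / det, (a00 * v1 - a01 * v0) / det]


random.seed(0)
w = [0.8, -0.3]                              # hypothetical true impulse response (M = N = 2)
x = [random.gauss(0.0, 1.0) for _ in range(2000)]
z = [0.0] + [w[0] * x[k] + w[1] * x[k - 1] for k in range(1, len(x))]   # v(k) = 0
h_opt = wiener_hopf_2tap(x, z)
```

With M = N and no observation noise, the estimate should agree with `w` up to rounding error,
which is the unbiased case of the discussion that follows.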
In the following, a brief explanation will be given regarding the relationship between the true
value vector $W_M$ and $h_N(\mathrm{opt}, k)$. First, suppose that M > N. In this case, the true
value vector is divided into its initial N components and the remaining components as follows:

$$W_M = (W_N^T, W_{M-N}^T)^T \qquad (10.20)$$

As a result, the output (observation signal) z(k) of the unknown system becomes

$$z(k) = W_N^T x_N(k) + W_{M-N}^T x_{M-N}(k-N) + v(k) \qquad (10.21)$$

Consequently, $v_N(k)$ can be transformed as follows:

$$v_N(k) = E[x_N(k) x_N^T(k)] W_N + E[x_N(k) x_{M-N}^T(k-N)] W_{M-N} + E[x_N(k) v(k)] \qquad (10.22)$$

In this case, $h_N(\mathrm{opt}, k)$ is given by Equation (10.19) as follows:

$$h_N(\mathrm{opt}, k) = W_N + B_N(k) \qquad (10.23)$$

Here, the bias vector $B_N(k)$ is given as follows:

$$B_N(k) = A_{N,N}^{-1}(k) \{ E[x_N(k) x_{M-N}^T(k-N)] W_{M-N} + E[x_N(k) v(k)] \} \qquad (10.24)$$

As a specific case, it is assumed that

$$E[x_N(k) x_{M-N}^T(k-N)] = 0 \qquad (10.25)$$

$$E[x_N(k) v(k)] = 0 \qquad (10.26)$$

As a result, the optimum coefficient vector $h_N(\mathrm{opt}, k)$ becomes:

$$h_N(\mathrm{opt}, k) = W_N \qquad (10.27)$$

That is, independently of time, it agrees with the initial N components of the true value vector
$W_M$.

On the other hand, when M < N, the bias vector $B_N(k)$ is given as

$$B_N(k) = A_{N,N}^{-1}(k) E[x_N(k) v(k)] \qquad (10.28)$$

Consequently, if Equation (10.26) holds, the following relationship, similar to Equation (10.27),
is obtained:

$$h_N(\mathrm{opt}, k) = W'_N \qquad (10.29)$$

Here, one has

$$W'_N = (w_0, w_1, \ldots, w_{M-1}, 0, 0, \ldots, 0)^T \qquad (10.30)$$

Even if Equations (10.25) and (10.26) do not hold in the strict sense, they can usually be assumed
in practical application. Also, when the influence of the bias vector $B_N(k)$ cannot be ignored,
the following scheme may be adopted: $B_N(k)$ is adaptively estimated and subtracted from the
optimum coefficient vector.

The configuration for determining the optimum coefficient vector based on Equation (10.19)
contains an averaging operation and the computation of an inverse matrix, as shown in Figure 10.2,
so it is not suited to real-time processing. As a method for performing the averaging operation
and the computation of the inverse matrix successively and efficiently, the recursive least
squares (hereinafter referred to as RLS) method [14, 26] has been proposed. In addition, because
$A_{N,N}(k)$ is a Toeplitz matrix, several methods are known that can significantly speed up the
computing procedure of the RLS [15, 28]. Equation (10.19) can also be solved by means of a
mathematical programming method (descent method).



Figure 10.2. Configuration of the Wiener-Hopf filter (assuming an ergodic property of the signal).

Key: 1 Unknown system
2 Group of delay units
3 Cross-correlation
4 Autocorrelation

The most typical descent method is the steepest descent method. Fast computing schemes belonging
to this family include the LMS algorithm [10], the jump algorithm [27], etc. In addition, there
are also the learning identification method and other computing schemes based on the orthogonal
projection theorem [11], [20-22].

In the following sections, typical adaptive algorithms and application examples of adaptive
filters will be explained. Here, it is assumed that all signals handled in the following
discussion are stationary stochastic processes, and an ergodic property is assumed.

10.2 Recursive least squares (RLS) method

In the following, an explanation will be given regarding the derivation of the RLS. In the RLS,
instead of solving Equation (10.19) once, $h_N(\mathrm{opt})$ is solved for while $A_{N,N}$ and
$v_N$ are determined recursively. Consequently, the estimated vector $h_N$ is updated for each
sample, so that it gradually approaches $h_N(\mathrm{opt})$.

Here, by assuming that the signals are stationary and ergodic, the following quantity is defined:

[illegible] (10.31)

As a result, the estimated values $\hat{A}_{N,N}(k)$, $\hat{v}_N(k)$ of $A_{N,N}$ and $v_N$ at
time k become:

[illegible] (10.32)

[illegible] (10.33)

Here,

[illegible] (10.34)

Here,

[illegible]

From Equation (10.19), one has

[illegible] (10.37)

Consequently, the estimate of $h_N$ that uses the data obtained until time k can be set as
$h_N(k+1)$:

$$h_N(k+1) = [X_{k-N+2,N}^T(k) X_{k-N+2,N}(k)]^{-1} X_{k-N+2,N}^T(k) z_{k-N+2}(k) \qquad (10.38)$$

where $X_{k-N+2,N}(k)$ denotes the matrix whose rows are the state vectors $x_N^T(\cdot)$ obtained
until time k, and $z_{k-N+2}(k)$ denotes the vector of the corresponding observation signals. It
is clear that when $k \to \infty$, Equation (10.38) agrees with $h_N(\mathrm{opt})$.

The next problem is to transform Equation (10.38) into a recursive representation in which its
right side is computed by efficiently reusing the data already obtained when new data are added.
For this purpose, the following quantity is defined:

$$P_{N,N}(k) = [X_{k-N+2,N}^T(k) X_{k-N+2,N}(k)]^{-1} \qquad (10.39)$$

Equation (10.38) can be transformed as follows:

$$h_N(k+1) = [X_{k-N+1,N}^T(k-1) X_{k-N+1,N}(k-1) + x_N(k) x_N^T(k)]^{-1} \{X_{k-N+1,N}^T(k-1) z_{k-N+1}(k-1) + x_N(k) z(k)\} \qquad (10.40)$$

so that, from the well-known matrix inversion formula

$$(A + BC)^{-1} = A^{-1} - A^{-1} B (I + C A^{-1} B)^{-1} C A^{-1} \qquad (10.41)$$

and Equation (10.39), the first factor becomes

$$[X_{k-N+1,N}^T(k-1) X_{k-N+1,N}(k-1) + x_N(k) x_N^T(k)]^{-1} = P_{N,N}(k-1) - k_N(k) x_N^T(k) P_{N,N}(k-1) \qquad (10.42)$$

Here, one has

$$k_N(k) = \frac{P_{N,N}(k-1) x_N(k)}{1 + x_N^T(k) P_{N,N}(k-1) x_N(k)} \qquad (10.43)$$

Here, because the left side of Equation (10.42) is $P_{N,N}(k)$, $P_{N,N}(k)$ can be obtained from
$P_{N,N}(k-1)$:

$$P_{N,N}(k) = P_{N,N}(k-1) - k_N(k) x_N^T(k) P_{N,N}(k-1) \qquad (10.44)$$

Consequently, Equation (10.40) becomes

$$h_N(k+1) = [P_{N,N}(k-1) - k_N(k) x_N^T(k) P_{N,N}(k-1)] \{X_{k-N+1,N}^T(k-1) z_{k-N+1}(k-1) + x_N(k) z(k)\} \qquad (10.45)$$

In addition, as explained below, Equation (10.45) can be transformed to the following
representation. First, from Equations (10.38) and (10.39), one has

$$P_{N,N}(k-1) X_{k-N+1,N}^T(k-1) z_{k-N+1}(k-1) = [X_{k-N+1,N}^T(k-1) X_{k-N+1,N}(k-1)]^{-1} X_{k-N+1,N}^T(k-1) z_{k-N+1}(k-1) = h_N(k) \qquad (10.46)$$

Also, from Equation (10.43), one has

$$k_N(k) [1 + x_N^T(k) P_{N,N}(k-1) x_N(k)] = P_{N,N}(k-1) x_N(k) \qquad (10.47)$$

This can be transformed to

$$k_N(k) = [I_{N,N} - k_N(k) x_N^T(k)] P_{N,N}(k-1) x_N(k) \qquad (10.48)$$

Here, $I_{N,N}$ represents a unit matrix having N rows and N columns. Consequently, from Equations
(10.46) and (10.48), Equation (10.45) can be transformed to the following recursive form:

$$h_N(k+1) = h_N(k) + k_N(k) [z(k) - x_N^T(k) h_N(k)] \qquad (10.49)$$

Summarizing the above, the computing procedure for the RLS listed in Table 10.1 is obtained.

Table 10.1. Computing procedure 1 for the RLS.
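The recursion of Equations (10.43), (10.44), and (10.49) can be sketched in Python as follows.
This is an illustration under stated assumptions rather than a transcription of the table: the
function name `rls_step`, the test system `w`, the noise-free white input, and the value of C are
hypothetical choices.

```python
import random


def rls_step(h, P, x, z):
    """One RLS update: gain (10.43), P update (10.44), coefficient update (10.49)."""
    N = len(h)
    Px = [sum(P[i][j] * x[j] for j in range(N)) for i in range(N)]    # P(k-1) x_N(k)
    denom = 1.0 + sum(x[i] * Px[i] for i in range(N))
    k_gain = [p / denom for p in Px]                                   # k_N(k)
    # P(k) = P(k-1) - k_N(k) x_N^T(k) P(k-1); P is symmetric, so x^T P = (P x)^T
    P_new = [[P[i][j] - k_gain[i] * Px[j] for j in range(N)] for i in range(N)]
    err = z - sum(h[i] * x[i] for i in range(N))                       # z(k) - x_N^T(k) h_N(k)
    h_new = [h[i] + k_gain[i] * err for i in range(N)]
    return h_new, P_new


random.seed(1)
w = [0.5, -0.4, 0.2]                       # hypothetical unknown system (M = N = 3)
N, C = len(w), 1.0e6                       # P(0) = C * I with C large; h(0) arbitrary
h = [0.0] * N
P = [[C if i == j else 0.0 for j in range(N)] for i in range(N)]
xs = [random.gauss(0.0, 1.0) for _ in range(200)]
for k in range(N - 1, len(xs)):
    x = [xs[k - i] for i in range(N)]      # x_N(k) = (x(k), ..., x(k-N+1))^T
    z = sum(w[i] * x[i] for i in range(N)) # noise-free observation
    h, P = rls_step(h, P, x, z)
```

Each call costs on the order of N^2 multiplications, which is the cost attributed to the RLS in
Section 10.1.1.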



The RLS listed in Table 10.1 may also be transformed as follows. First, both sides of Equation
(10.44) are multiplied on the right by $x_N(k)$, and by using Equation (10.48), one obtains

$$k_N(k) = P_{N,N}(k) x_N(k) \qquad (10.50)$$

The $k_N(k)$ obtained from Equation (10.50) is substituted into step No. 3 of the procedure listed
in Table 10.1, and step No. 1 of the procedure in Table 10.1 is substituted into step No. 2,
which gives the procedure listed in Table 10.2.



Table 10.2. Computing procedure 2 for the RLS. 
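The identity behind the second procedure, namely that the gain of Equation (10.43) equals
$P_{N,N}(k) x_N(k)$ as stated in Equation (10.50), can be checked numerically with a small sketch.
The matrix `P_prev` and the vector `x` below are arbitrary hypothetical values, and the helper
name `gain_and_p` is not from the text.

```python
def gain_and_p(P, x):
    """Return k_N(k) from (10.43) and P(k) from (10.44), given P(k-1) and x_N(k)."""
    N = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(N)) for i in range(N)]
    denom = 1.0 + sum(x[i] * Px[i] for i in range(N))
    k_gain = [p / denom for p in Px]
    P_new = [[P[i][j] - k_gain[i] * Px[j] for j in range(N)] for i in range(N)]
    return k_gain, P_new


P_prev = [[2.0, 0.5], [0.5, 1.0]]          # hypothetical symmetric positive definite P(k-1)
x = [1.0, -2.0]                            # hypothetical state vector x_N(k)
k_gain, P_now = gain_and_p(P_prev, x)
k_alt = [sum(P_now[i][j] * x[j] for j in range(2)) for i in range(2)]   # P(k) x_N(k)
```

Here `k_gain` and `k_alt` agree up to rounding, so the gain can be formed after the update of P,
which is the reordering the second procedure relies on.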



When the procedure listed in Table 10.1 or Table 10.2 is executed, the initial values $h_N(0)$ and
$P_{N,N}(0)$ have to be determined. They may be selected as follows:

$$h_N(0) = \text{(arbitrary)}, \qquad P_{N,N}(0) = C I_{N,N} \ \ \text{(C is a very large positive number)} \qquad (10.51)$$

For details, the reader is referred to references [14], [26].



10.3 LMS algorithm 

Here, the LMS algorithm [5], [10] (hereinafter referred to as LMS) and the steepest descent method
on which it is founded will be explained. First, a brief account will be given of the steepest
descent method.

Figure 10.3. Contour lines of $E[e^2(k)]$.



At any $h_N$, the gradient vector $G_N(h_N)$ is defined as

$$G_N(h_N) = \frac{\partial J}{\partial h_N} = 2 A_{N,N} h_N - 2 v_N \qquad (10.52)$$

(see Equation (10.18)). Equation (10.10) is a quadratic form in the parameter $h_N$, and only one
$h_N$ gives the minimum evaluation quantity J. Figure 10.3 shows the situation when N = 2. The
curves shown in Figure 10.3 are the contour lines of the values of J when the coefficients $h_0$,
$h_1$ vary. Here, $G_N(h_N)$ is the gradient at the coefficient $h_N$, and its direction agrees
with that of the normal to the contour lines. Consequently, by setting any point $h_N(0)$ as the
initial value and moving $h_N(0)$ appropriately in the direction of $-G_N(h_N(0))$, J at $h_N(1)$
can be made smaller than J at $h_N(0)$. Here, $h_N(j)$ represents the jth corrected value of
$h_N$. By repeating this process, $h_N(j)$ approaches $h_N(\mathrm{opt})$ without limit. This
algorithm can be summarized as follows:

$$h_N(j) = h_N(j-1) - 0.5 \alpha(j) G_N(h_N(j-1)) \qquad (10.53)$$

Here, Equation (10.53) is called the steepest descent method, and α(j) is called the step gain.
The coefficient 0.5 attached to the step gain is introduced only to simplify the later
transformation of equations, and it has no special meaning. This configuration is shown in Figure
10.4.
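A minimal Python sketch of the iteration of Equation (10.53) with a fixed step gain is given
below. The statistics are assumed to be known exactly; the matrix `A`, the vector `w` used to
build `v`, and the function name `steepest_descent` are hypothetical values chosen for
illustration.

```python
def steepest_descent(A, v, h0, alpha, steps):
    """Iterate h(j) = h(j-1) - 0.5 * alpha * G(h(j-1)) with G(h) = 2 A h - 2 v."""
    h = list(h0)
    N = len(h)
    for _ in range(steps):
        grad = [2.0 * sum(A[i][j] * h[j] for j in range(N)) - 2.0 * v[i] for i in range(N)]
        h = [h[i] - 0.5 * alpha * grad[i] for i in range(N)]
    return h


A = [[1.0, 0.5], [0.5, 1.0]]   # hypothetical autocorrelation matrix (eigenvalues 0.5 and 1.5)
w = [0.8, -0.3]                # hypothetical optimum; v is built so that h(opt) = w
v = [sum(A[i][j] * w[j] for j in range(2)) for i in range(2)]
h = steepest_descent(A, v, [0.0, 0.0], alpha=1.0, steps=60)
```

With this `A` and a step gain of 1, each correction halves the magnitude of both eigen-components
of the error, so `h` is indistinguishable from `w` after 60 corrections.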
Figure 10.4. Configuration for determining the Wiener-Hopf solution using the steepest descent
method.

Key: 1 Group of delay units
2 Cross-correlation
3 Unknown system
4 Autocorrelation
5 Linear transformation
6 Delay

In the following, an explanation will be given regarding how the distance between
$h_N(\mathrm{opt})$ and $h_N(j)$ varies when $h_N$ is corrected according to Equation (10.53). For
this purpose, the error vector is defined as

$$E_N(j) = h_N(j) - h_N(\mathrm{opt}) \qquad (10.54)$$

From Equations (10.19), (10.52), and (10.53), Equation (10.54) can be transformed as follows:

$$E_N(j) = E_N(j-1) - \alpha(j) A_{N,N} E_N(j-1), \quad j = 1, 2, 3, \ldots \qquad (10.55)$$

Consequently, the variation of the error vector $E_N(j)$ is given as

$$E_N(j) = \prod_{i=1}^{j} [I_{N,N} - \alpha(i) A_{N,N}] E_N(0) \qquad (10.56)$$

It can be seen that whether the magnitude of $E_N(j)$ decreases as j increases depends on the step
gain α(j) and on $A_{N,N}$ (a property of the input signal). To show this property more clearly,
$A_{N,N}$ is transformed as

$$A_{N,N} = Q_{N,N} D_{N,N} Q_{N,N}^T \qquad (10.57)$$

Here, $Q_{N,N}$ represents an orthogonal matrix whose column vectors are the eigenvectors of
$A_{N,N}$:

$$Q_{N,N}^{-1} = Q_{N,N}^T \qquad (10.58)$$

Also, $D_{N,N}$ is a diagonal matrix having the eigenvalues of $A_{N,N}$ as its diagonal elements:

$$D_{N,N} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_N) \qquad (10.59)$$

Consequently, one has

$$I_{N,N} - \alpha(j) A_{N,N} = Q_{N,N} [I_{N,N} - \alpha(j) D_{N,N}] Q_{N,N}^T \qquad (10.60)$$

From Equations (10.56), (10.58), and (10.60), one has

$$E_N(j) = Q_{N,N} \prod_{i=1}^{j} [I_{N,N} - \alpha(i) D_{N,N}] Q_{N,N}^T E_N(0) \qquad (10.61)$$

Here, when the value of α(j) at each correction is selected to be the reciprocal of the eigenvalue
$\lambda_j$ of $A_{N,N}$, one has

$$E_N(j) = 0, \quad j \ge N \qquad (10.62)$$

That is, in the Nth and later corrections, $h_N(j)$ agrees with $h_N(\mathrm{opt})$. For example,
when a white signal of appropriate magnitude is taken as the input signal, $A_{N,N}$ becomes a
unit matrix. Consequently, by setting α(j) = 1 (j = 1, 2, ...), the optimum solution is obtained
in a single correction. When the step gain is kept fixed at every correction, it can be seen that
the optimum solution is approached quickly if the eigenvalues of $A_{N,N}$ are uniform in
magnitude. On the other hand, for an audio signal and other colored signals, the ratio of the
maximum eigenvalue to the minimum eigenvalue is very large, and, when the step gain is fixed, the
convergence speed degrades significantly. The property described here also carries over to a
certain degree to the LMS described next.
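The effect of choosing each step gain as the reciprocal of an eigenvalue can be illustrated with
a small numerical sketch for N = 2. The matrix `A` below is a hypothetical example whose
eigenvalues are 0.5 and 1.5, and `w` is a hypothetical optimum used to construct `v`; the update
itself is the steepest descent correction of Equation (10.53).

```python
A = [[1.0, 0.5], [0.5, 1.0]]               # hypothetical A with eigenvalues 0.5 and 1.5
w = [0.8, -0.3]                            # hypothetical optimum coefficient vector
v = [A[i][0] * w[0] + A[i][1] * w[1] for i in range(2)]   # v = A w

h = [0.0, 0.0]                             # arbitrary initial value
for lam in (0.5, 1.5):                     # one correction per eigenvalue
    alpha = 1.0 / lam                      # step gain = reciprocal of the eigenvalue
    grad = [2.0 * (A[i][0] * h[0] + A[i][1] * h[1]) - 2.0 * v[i] for i in range(2)]
    h = [h[i] - 0.5 * alpha * grad[i] for i in range(2)]
```

After these N = 2 corrections `h` coincides with `w` up to rounding: each correction removes the
error component along one eigenvector, and a component that has been removed stays removed.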

The above explanation assumes that the statistical quantities ($A_{N,N}$, $v_N$, or $G_N(h_N)$)
are known. However, in practical application, there is usually no time to compute these
statistical quantities. In the following, an explanation will be given regarding the LMS proposed
by Widrow and Hoff.

When the averaging operation is omitted in Equation (10.53), Equation (10.53) can be transformed
to

$$h_N(j) = h_N(j-1) - \alpha(j) [x_N(k) x_N^T(k) h_N(j-1) - x_N(k) z(k)] \qquad (10.63)$$

(see Equations (10.52), (10.11) and (10.12)). The LMS is obtained from Equation (10.63) by setting
j = k + 1 and α(j) = α. That is,

$$h_N(k+1) = h_N(k) - \alpha [y(k) - z(k)] x_N(k) \qquad (10.64)$$

In this way, the estimated vector $h_N$ for subsequent use can be obtained repeatedly from the
data at time k. Also, the step gain α is selected according to the statistical properties of the
input signal, as explained with regard to the steepest descent method. In the following range of
α,

$$0 < \alpha < \frac{2}{\lambda_{\max}} \qquad (10.65)$$

it is known that the evaluation quantity approaches 0 [20]. Here, $\lambda_{\max}$ represents the
maximum eigenvalue of $A_{N,N}$. Also, the initial value $h_N(0)$ may be any point. This
configuration is shown in Figure 10.5.




Figure 10.5. Configuration of the LMS algorithm.
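The LMS update can be sketched in a few lines of Python. The sketch assumes a noise-free unknown
system of the same order as the filter and a white, unit-variance input, for which the maximum
eigenvalue is about 1; the function name `lms`, the test system `w`, and the step gain 0.05 are
hypothetical choices.

```python
import random


def lms(xs, zs, N, alpha):
    """LMS: h(k+1) = h(k) - alpha * (y(k) - z(k)) * x_N(k), starting from h(0) = 0."""
    h = [0.0] * N
    for k in range(N - 1, len(xs)):
        x = [xs[k - i] for i in range(N)]                  # state vector x_N(k)
        y = sum(h[i] * x[i] for i in range(N))             # filter output y(k)
        err = y - zs[k]
        h = [h[i] - alpha * err * x[i] for i in range(N)]
    return h


random.seed(2)
w = [0.5, -0.4, 0.2]                                       # hypothetical unknown system
xs = [random.gauss(0.0, 1.0) for _ in range(5000)]
zs = [sum(w[i] * xs[k - i] for i in range(3)) if k >= 2 else 0.0 for k in range(len(xs))]
h = lms(xs, zs, N=3, alpha=0.05)
```

Only on the order of N multiplications are needed per sample, in contrast with the RLS, at the
price of slower convergence when the input is colored.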



10.4 Learning identification method 

In the following, an explanation will be given regarding the learning identification method
proposed by Nagumo and Noda [11]. In the following discussion, unless specified otherwise, it is
assumed that the unknown system and the adaptive filter have the same order (M = N), and that no
observation noise v(k) exists (v(k) = 0).

Now, consider the case when, at time k, the output y(k) of the adaptive filter is equal to the
output d(k) of the unknown system. That is, one has

$$x_N^T(k) h_N = d(k) \qquad (10.66)$$

It is clear that the parameter $W_N$ of the unknown system (true value vector) satisfies Equation
(10.66). However, the parameter $h_N$ (estimated vector) of the adaptive filter is not necessarily
equal to $W_N$; $h_N = W_N$ holds when Equation (10.66) is satisfied for all input signals. In
this way, the $h_N$ that satisfy Equation (10.66) form a collection of solutions that includes the
true value vector. Here, as shown in Figure 10.6, the representative vector $h_N(k+1)$ of the
$h_N$ that satisfy Equation (10.66) is taken as the foot of the perpendicular drawn from an
appropriately chosen arbitrary point to the solution collection. As can be seen from Equation
(10.66), the solution collection is perpendicular to the state vector $x_N(k)$. In addition,
because $W_N$ is contained in the solution collection, $h_N(k+1)$ is the point nearest to $W_N$
when the coefficient is corrected from a certain point in the direction of $x_N(k)$.




Figure 10.6. Solution collection.

Key: 1 Solution collection
2 Certain point

Figure 10.7. Geometric relationship of solutions obtained using the learning identification
method.

In order to repeat the aforementioned operation such that $h_N(k+1)$ approaches $W_N$, one may
take $h_N(k)$, which is nearer to $W_N$ than an arbitrarily chosen point, as the starting point of
the next coefficient correction. This is shown in Figure 10.7. In Figure 10.7, Π(k-1) and Π(k)
represent the solution collections at times k - 1 and k, respectively. In other words, they are
the collections of adaptive filter coefficients whose outputs are equal to the desired signals
d(k-1), d(k) at times k - 1 and k, respectively. Also, because the solution vector $W_N$ is the
point that gives the desired signal d(k) for all state vectors
$\{x_N(k)\ (-\infty < k < \infty)\}$, it is positioned at the intersection of all solution
collections $\{\Pi(k)\ (-\infty < k < \infty)\}$.

The aforementioned can be summarized as follows:

$$h_N(k+1) = h_N(k) + \underbrace{\frac{\{W_N - h_N(k)\}^T x_N(k)}{\|x_N(k)\|}}_{a} \cdot \underbrace{\frac{x_N(k)}{\|x_N(k)\|}}_{b} \qquad (10.67)$$

Key: a Correction quantity
b Correction direction

Here, $\|\cdot\|$ represents the Euclidean norm of a vector, defined as the square root of the sum
of the squares of its elements. Here, because

$$\{W_N - h_N(k)\}^T x_N(k) = d(k) - y(k) = e(k) \qquad (10.68)$$

one may transform Equation (10.67) to

$$h_N(k+1) = h_N(k) + \frac{e(k)}{\|x_N(k)\|^2} x_N(k) \qquad (10.70)$$

The learning identification method is realized by multiplying the correction vector of Equation
(10.70) by the step gain α to obtain

$$h_N(k+1) = h_N(k) + \alpha \frac{e(k)}{\|x_N(k)\|^2} x_N(k) \qquad (10.71)$$

If the aforementioned assumptions (M = N, v(k) = 0) hold, the adaptive filter coefficient
$h_N(k+1)$ updated by means of Equation (10.70) gives the desired signal with respect to the state
vector $x_N(k)$ used for the update, independently of the initial value. In addition, even if the
parameter of the unknown system varies, this characteristic feature is still retained as far as
time k is concerned.
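The property just described can be observed with a short Python sketch of the learning
identification (normalized LMS) update. The sketch assumes M = N = 3, no observation noise, and a
step gain of 1; the function name `nlms_step`, the test system `w`, and the guard against an
all-zero state vector are hypothetical additions for illustration.

```python
import random


def nlms_step(h, x, d, alpha=1.0):
    """Learning identification update: h(k+1) = h(k) + alpha * e(k) * x_N(k) / ||x_N(k)||^2."""
    e = d - sum(hi * xi for hi, xi in zip(h, x))
    norm2 = sum(xi * xi for xi in x)
    if norm2 == 0.0:
        return list(h)                     # an all-zero state vector carries no information
    return [hi + alpha * e * xi / norm2 for hi, xi in zip(h, x)]


random.seed(3)
w = [0.5, -0.4, 0.2]                       # hypothetical unknown system
h = [3.0, 3.0, 3.0]                        # arbitrary initial value
xs = [random.gauss(0.0, 1.0) for _ in range(400)]
worst = 0.0                                # largest |y - d| seen right after an update
for k in range(2, len(xs)):
    x = [xs[k], xs[k - 1], xs[k - 2]]
    d = sum(wi * xi for wi, xi in zip(w, x))
    h = nlms_step(h, x, d)
    y_post = sum(hi * xi for hi, xi in zip(h, x))   # output recomputed with h(k+1)
    worst = max(worst, abs(y_post - d))
```

With a step gain of 1, the output recomputed with the updated coefficients matches d(k) at every
step, whatever the initial value, while `h` itself moves toward `w` one projection at a time.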



10.5 Block adaptive algorithm 

In the following, an explanation will be given regarding the BOP algorithm (hereinafter referred
to as BOP), in which the concept of block processing is introduced into the learning
identification method described in Section 10.4, and the BLMS algorithm (hereinafter referred to
as BLMS) in the frequency domain.

10.5.1 BOP algorithm 

In the following discussion, unless specified otherwise, it is assumed that the unknown system and
the adaptive filter have the same order (M = N), and that no observation noise v(k) exists
(v(k) = 0). The equation corresponding to Equation (10.66) is

$$X_{N,r}^T(k) h_N = d_r(k) \qquad (10.72)$$

Here, at time k, the state matrix $X_{N,r}(k)$ and the desired signal vector $d_r(k)$ are given as
follows:

$$X_{N,r}(k) = \begin{pmatrix} x(k) & x(k-1) & \cdots & x(k-r+1) \\ x(k-1) & x(k-2) & \cdots & x(k-r) \\ \vdots & \vdots & & \vdots \\ x(k-N+1) & x(k-N) & \cdots & x(k-N-r+2) \end{pmatrix} \qquad (10.73)$$

$$d_r(k) = (d(k), d(k-1), \ldots, d(k-r+1))^T \qquad (10.74)$$

Here, r is the quantity known as the block length, and processing is performed with this block
length as the unit. There is also a precondition that the block length not exceed the size of the
coefficient vector. That is,

$$r \le N \qquad (10.75)$$

According to the BOP algorithm, coefficient correction is performed once every r samples.
Consequently, compared with a method in which correction is performed for each sample, the
computational cost per sample becomes 1/r.

In addition, for block Nos. L - 1 and L, one has

$$X_{N,r}^T((L-1)r) h_N^{(L)} = d_r((L-1)r) \qquad (10.76)$$

$$X_{N,r}^T(Lr) h_N^{(L+1)} = d_r(Lr) \qquad (10.77)$$

Here, $h_N^{(L)}$ and $h_N^{(L+1)}$ represent any vectors in the solution spaces of block Nos.
L - 1 and L, respectively (see Section 10.4). Also, letting the solution spaces of block Nos.
L - 1 and L be Π(L-1) and Π(L), they are orthogonal to the spaces $S^{(L-1)}$ and $S^{(L)}$
spanned by the column vectors of the state matrices $X_{N,r}((L-1)r)$ and $X_{N,r}(Lr)$,
respectively. Consequently, although it is a little abstract, by reading Figure 10.7 with the
replacements

$$x_N(k-1) \to S^{(L-1)}, \qquad x_N(k) \to S^{(L)} \qquad (10.78)$$

one can derive the block adaptive version of the learning identification method. Assuming that the
adaptive filter coefficient used in the Lth block is $h_N^{(L)}$, from Equation (10.77), one has

$$X_{N,r}^T(Lr) \{h_N^{(L+1)} - h_N^{(L)}\} = e_r(Lr) \qquad (10.79)$$

Here, $e_r(Lr)$ is the output error vector, that is, the difference between the desired signal
vector $d_r(Lr)$ and the output signal vector $y_r(Lr)$:

$$e_r(Lr) = d_r(Lr) - y_r(Lr) \qquad (10.80)$$

Numerous $h_N^{(L+1)}$ exist that satisfy Equation (10.79). However, $\|h_N^{(L+1)} - h_N^{(L)}\|$
is smallest for the point obtained by orthogonal projection from $h_N^{(L)}$ onto the solution
space Π(L). Consequently, $h_N^{(L+1)}$ becomes the solution of the following constrained
simultaneous equations:

$$\min \|h_N^{(L+1)} - h_N^{(L)}\| \quad \text{subject to} \quad X_{N,r}^T(Lr) \{h_N^{(L+1)} - h_N^{(L)}\} = e_r(Lr) \qquad (10.81)$$

The solution of Equation (10.81) can generally be represented as follows by using the
Moore-Penrose generalized inverse matrix:

$$h_N^{(L+1)} = h_N^{(L)} + [X_{N,r}^T(Lr)]^{+} e_r(Lr) \qquad (10.82)$$

Here, $[X_{N,r}^T(Lr)]^{+}$ is the Moore-Penrose generalized inverse matrix of $X_{N,r}^T(Lr)$.

BOP is given by applying a step gain α to Equation (10.82):

$$h_N^{(L+1)} = h_N^{(L)} + \alpha [X_{N,r}^T(Lr)]^{+} e_r(Lr) \qquad (10.83)$$

Just like the learning identification method, BOP has the characteristic feature that, whatever
the initial value from which computation starts, the solution space is reached in a single
correction. It is well known that the number of multiplications needed in computing Equation
(10.83) is proportional to $r^2 N$. Because coefficient correction is performed once every r
samples, the equivalent number of multiplications per sample is on the order of rN. It is believed
that, by selecting a small value for r, the processing can be performed effectively with
currently available technology. The specific computing method of BOP and its properties are
omitted due to space constraints. For details, the reader is referred to references [23] and
[24]. Also, in consideration of the influence of finite word length arithmetic in practical
application, better results are usually obtained if $[X_{N,r}^T(Lr)]^{+} e_r(Lr)$ is not
calculated too precisely.
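A block update of the BOP type can be sketched in Python for the smallest nontrivial case, block
length r = 2 with N = 3. The sketch uses the closed form X (X^T X)^{-1} in place of a general
Moore-Penrose routine, which is valid only when the columns of the state matrix are linearly
independent; that restriction, the function names, the determinant threshold, and the test system
`w` are assumptions made for illustration and are not the computing method of references [23],
[24].

```python
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def bop_block_update(h, x_cols, d_vec, alpha=1.0):
    """Block update for r = 2: h <- h + alpha * X (X^T X)^{-1} e, with e = d - X^T h."""
    c0, c1 = x_cols                                   # columns x_N(k), x_N(k-1) of X
    e0, e1 = d_vec[0] - dot(c0, h), d_vec[1] - dot(c1, h)
    g00, g01, g11 = dot(c0, c0), dot(c0, c1), dot(c1, c1)   # Gram matrix X^T X
    det = g00 * g11 - g01 * g01
    if abs(det) < 1e-12:
        return list(h)                                # skip a (near) rank-deficient block
    u0 = (g11 * e0 - g01 * e1) / det                  # u = (X^T X)^{-1} e
    u1 = (g00 * e1 - g01 * e0) / det
    return [hi + alpha * (u0 * a + u1 * b) for hi, a, b in zip(h, c0, c1)]


random.seed(4)
w = [0.5, -0.4, 0.2]                                  # hypothetical unknown system
N, r = 3, 2
h = [0.0] * N
xs = [random.gauss(0.0, 1.0) for _ in range(400)]
for k in range(N + r - 2, len(xs), r):                # one correction every r samples
    cols = [[xs[k - j - i] for i in range(N)] for j in range(r)]
    d_vec = [dot(w, c) for c in cols]                 # noise-free desired signals
    h = bop_block_update(h, cols, d_vec)
```

After each correction with a step gain of 1, the updated coefficients reproduce both desired
signals of the block, which is the block counterpart of the single-vector property of the
learning identification method.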



10.5.2 BLMS algorithm 

As described in Section 10.3, the LMS algorithm uses the following estimated values of the
various statistical quantities in the steepest descent method:

[illegible] (10.84)
[illegible] (10.85)

Based on these equations, Equation (10.52) is computed, and, in Equation (10.53), one has

(10.86)

Here, the BLMS is obtained using the following relationships:

(10.87)
(10.88)

Here, r represents the block length. In this case, Equation (10.53) becomes:

Figure 10.8. Adaptive filter using BLMS.

Key: 6 Only the initial N points are adopted
7 Computing of h_N(rL + r)
8 BLMS algorithm

Here, L represents the block number. In addition, assuming that [illegible], the algorithm known
as BLMS is obtained as follows:

[illegible] (10.90)

Here, in [illegible], one has

Usually, the BLMS is used together with the FFT. It is well known that filtering in the
frequency domain using the FFT has the advantage that the number of multiplications is
smaller than that in the time domain [30]. Figure 10.8 is a diagram illustrating the
configuration when the BLMS is adopted in a signal processing system based on the frequency
domain.

In addition, there are also methods that update parameters directly in the frequency
domain. For details, the reader is referred to references [6], [7], [17], [18], [31]. Also, the
jump algorithm for increasing the convergence speed of the BLMS will be explained in Section 10.6.
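The block updating idea can be sketched in the time domain as follows. This is only an
illustration under stated assumptions: the correction is the average of e(k) x(k) over the
block multiplied by a step gain mu (the scaling used in Equation (10.90) may differ), the
filtering is done directly rather than with the FFT, and the names blms_identify, true_h and
block_len are hypothetical.

```python
import random


def blms_identify(x, d, n_taps, block_len, mu):
    """Time-domain block LMS sketch: coefficients change once per block."""
    h = [0.0] * n_taps
    buf = [0.0] * n_taps          # buf[0] = x(k), buf[1] = x(k-1), ...
    grad = [0.0] * n_taps         # accumulated e(k) * x(k) within the block
    for k, (xk, dk) in enumerate(zip(x, d)):
        buf = [xk] + buf[:-1]
        # The error is computed with the coefficients frozen during the block
        e = dk - sum(hi * bi for hi, bi in zip(h, buf))
        grad = [g + e * bi for g, bi in zip(grad, buf)]
        if (k + 1) % block_len == 0:
            h = [hi + mu / block_len * g for hi, g in zip(h, grad)]
            grad = [0.0] * n_taps
    return h


rng = random.Random(0)
true_h = [0.8, -0.4, 0.2]         # hypothetical unknown FIR system
x = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
d = [sum(true_h[i] * x[k - i] for i in range(3) if k - i >= 0)
     for k in range(len(x))]
h_est = blms_identify(x, d, n_taps=3, block_len=8, mu=0.5)
```

In this noise-free example h_est approaches true_h. Because the coefficients are frozen
within each block, the r output samples of a block can be computed together, which is what
makes FFT-based filtering applicable.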

(Hajime Kubota) 

10.6 Jump algorithm [27, 32] 

The error surface of an FIR type adaptive filter is represented by the following quadratic
form with respect to [illegible]:

[illegible] (10.92)
[illegible] (10.93)

Here, the eigenvalue decomposition of the autocorrelation matrix of the input signal is taken as

where v_n = V Δh_n. It is well known that the eigenvalue λ_i is the curvature in the direction of
the ith principal axis (eigenvector) of the error surface [33]. Consequently, the error surface
is steep in the principal axis directions where the energy component of the error is large.
For such an error function, the RLS method, as a variant of the Newton-Raphson method, can
bring the fastest convergence. On the other hand, when the autocorrelation of the input signal is
significant, the error energy is concentrated in only some of the eigenvector directions, and the
error surface is steep in these directions, while it is very flat in the other directions with
smaller eigenvalues. When such an input correlation matrix approaches degeneracy, the RLS and
other computing methods become unstable. The reason is as follows: the weights in the flat
directions contribute little to the square error, so their overshoot cannot be suppressed by
minimizing the square error. Consequently, it would be ideal to use different updating step
sizes in different eigenvector directions. In addition, when the input signal has high
correlation, the gradient vector deviates significantly from the direction toward the optimum
solution. Consequently, it is difficult to increase the speed of a gradient computing method
that updates in this direction.

In order to solve the aforementioned problems, a jump algorithm has been proposed.
According to this computing method, updating is performed in the gradient direction, while the
reciprocal of an eigenvalue of the autocorrelation matrix of the input signal is used as the step
size.

[illegible] (10.95)

Here, [illegible] represents the estimated value of the gradient vector, and [illegible] is the
estimated value of the eigenvalue λ_i. In the case of real-time processing, [illegible] is
computed using the DCT, a good approximation of the KLT (see Chapter 4), or the like. As time n
elapses, the step size repeatedly uses [illegible] in descending order. A larger one also has a
longer application time. Values of λ_i smaller than a certain lower limit are discarded.

Also, when it is realized as a block computing method, the following form is taken:

Here, m represents the block number. In this case, it is possible to use fast computing methods,
such as the DCT for determining the eigenvalues and the sliding FFT for executing the
convolution operation. Consequently, the overall computational load is similar to that of the
block learning identification method.

The jump algorithm is based on the following updating principle. In an FIR type adaptive
filter, when coefficient vector h_m is updated using Equation (10.96), the time correlation
matrix of the input signal, [illegible], and its eigenvalue decomposition are taken as

and one has

[illegible] (10.97)

That is,

[illegible] (10.98)



When 1/λ_i(m) is used as the step size for updating, the ith component v_i(m) of vector v_m
becomes zero. This is tantamount to v_m moving in the gradient direction and stopping exactly at
the crossing point between the extension line of the gradient vector and the orthogonal
complement of the ith principal axis. Consequently, the weighting coefficient space is reduced
to the subspace [illegible]. Such crossing points always exist in the gradient direction, and
they are the optimum stop positions for updating v_m. Similarly, when the reciprocal of another
eigenvalue λ_j is used as the step size, the dimension of v_m is reduced further, and v_m enters
the orthogonal complement of the ith and jth principal axes. Consequently, after
N rounds of said updating, v_m reaches the minimum point on the error surface.
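The updating principle above can be checked numerically for N = 2 with a minimal sketch. It
assumes an exactly known correlation matrix R and an exactly known gradient (the practical
algorithm uses estimates of both), and it takes R times the coefficient error as the gradient,
that is, the gradient of the quadratic form up to a factor of 2. The numerical values and the
names eig_sym_2x2, matvec and h_opt are hypothetical.

```python
import math


def eig_sym_2x2(r):
    """Eigenvalues of a symmetric 2x2 matrix [[a, b], [b, c]], largest first."""
    a, b, c = r[0][0], r[0][1], r[1][1]
    mean, dev = (a + c) / 2.0, math.hypot((a - c) / 2.0, b)
    return [mean + dev, mean - dev]


def matvec(r, v):
    return [sum(rij * vj for rij, vj in zip(row, v)) for row in r]


R = [[2.0, 1.0], [1.0, 2.0]]      # input autocorrelation matrix (example values)
h_opt = [1.0, -3.0]               # minimum point of the error surface (example values)
h = [0.0, 0.0]                    # arbitrary initial value
for lam in eig_sym_2x2(R):        # step sizes 1/lambda_i, one eigenvalue per update
    # Gradient direction R (h - h_opt), i.e. the true gradient up to a factor 2
    grad = matvec(R, [hi - oi for hi, oi in zip(h, h_opt)])
    h = [hi - g / lam for hi, g in zip(h, grad)]
```

After the first update the coefficient error has no component along the eigenvector belonging
to the first eigenvalue, and after the second update h coincides with h_opt up to rounding,
which corresponds to the route sketched for the two-dimensional case.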




Figure 10.9. Error surface and optimum updating position in two-dimensional parameter
space.

Figure 10.10. Optimum updating route in three-dimensional parameter space.

Figures 10.9 and 10.10 show the optimum stop positions in the gradient direction and the
updating process through them. Figure 10.9 shows the updating route in the parameter space
when N = 2, and Figure 10.10 shows the case when N = 3.

The following properties of jump algorithms have been found. Because large eigenvalues
correspond to the steep directions on the error surface, the small step sizes given by their
reciprocals can bring faster convergence. On the contrary, in the directions corresponding to
small eigenvalues, the error surface is flat, and the large step sizes given by their reciprocals
correspond to a slow convergence mode. Consequently, when small step sizes are mainly used in
updating, the convergence is faster, and it is possible to obtain an adapting performance with
smaller misadjustment.

10.7 IIR type adaptive filter 

Usually, when a model is formed for a linear system, a form having a rational
transfer function is the most natural. As a matter of fact, when the impulse response is very
long, or when an unknown system containing damped oscillation is to be approximated, an IIR type
filter having a rational transfer function is more efficient than an FIR type filter having a
polynomial transfer function. Consequently, research on IIR type adaptive filters has received
attention.

In a formulation with an adaptive filter used in system identification, assume that the
input of the unknown system is x_n, the output of the unknown system (the desired signal) is d_n,
the observation value of the output signal is z_n, the interference and noise contained in the
observation signal are v_n, and the output of the estimation system is y_n. In this case, the
transfer function of the unknown system is

and the output of this unknown system is

(10.100)
(10.101)

(In the following, as shown in the above formulas, the time-domain notation of the transfer
function as a polynomial in the delay operator z^-1 will be adopted.)

Here, coefficient vector θ, input/output vector φ_n, and the minimum prediction error e_n^*
are defined as follows.

[illegible] (10.102)
(10.103)

In this case, y_n can be written as

An unknown system in this form is called a linear regression model [68].

10.7.1 Series/parallel structure 

A series/parallel structure is also called the equation error method. According to this
method, the observation value of the input signal and the past values of the output are used, and
an adaptive estimation is formulated by predicting the current value of the output. Here, the
estimated value of coefficient vector θ of the transfer function obtained at time n is taken as
θ_n, and, with the mean square value of the prediction error (or equation error) minimized:

[illegible] (10.106)
[illegible] (10.107)

the coefficient θ_n at time n of the adaptive filter is updated with the configuration shown in
Figure 10.11.

Figure 10.11. Series/parallel type IIR adaptive filter.



In the aforementioned equation, the structure does not contain feedback of the signal to
the estimation system itself, and it is not a genuine IIR type. Consequently, the square error
becomes a convex function of the estimation coefficient θ_n, and the estimation computing method
is similar to that of an FIR type adaptive filter, so the gradient computing method and the least
squares method can be adopted for it.

In a gradient or LMS series/parallel adaptive filter, the estimation coefficient θ_n is
adaptively updated as follows.

(10.108)
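A minimal sketch of a series/parallel (equation error) LMS update is given below for a
first-order example. It is an illustration under stated assumptions, not the exact form of
Equation (10.108): the observation is noise-free, the model is z(n) = a1 z(n-1) + b0 x(n), so the
sign convention of the denominator coefficient may differ from that of the text, and the
correction term is mu e(n) φ_n. The function name equation_error_lms and the numerical values
are hypothetical.

```python
import random


def equation_error_lms(x, z, mu):
    """Adapt theta = [a1, b0] so that a1*z(n-1) + b0*x(n) predicts z(n)."""
    theta = [0.0, 0.0]
    z_prev = 0.0
    for xn, zn in zip(x, z):
        # The regressor uses the OBSERVED past output, not the filter's own output,
        # so there is no feedback loop inside the estimator.
        phi = [z_prev, xn]
        e = zn - sum(t * p for t, p in zip(theta, phi))   # equation error
        theta = [t + mu * e * p for t, p in zip(theta, phi)]
        z_prev = zn
    return theta


rng = random.Random(1)
x = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
z, prev = [], 0.0
for xn in x:
    prev = 0.5 * prev + 1.0 * xn      # hypothetical unknown system
    z.append(prev)
theta = equation_error_lms(x, z, mu=0.1)
```

Because the regressor is built from observed signals only, the error is linear in theta and the
update behaves like an FIR LMS update. In this noise-free case theta approaches [0.5, 1.0].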

Also, a system has been proposed in which the coefficient vectors of the numerator and
the denominator polynomials are updated individually.

[illegible] (10.110)

The conditions for stability of the various step sizes have been studied [45]. 

When the RLS method is adopted in an IIR type adaptive filter, by replacing coefficient
vector h_n with θ_n and replacing the input vector x_n with input/output vector φ_n, the
algorithm described in Section 10.2 can be adopted [68], [70].

For the estimate of the transfer function obtained as described above, when noise can be
ignored and the persistent excitation condition is also met, it is possible to guarantee
convergence to the true transfer function or to the optimum estimated value most appropriate for
its approximation [[illegible]]. Because of major advantages such as guaranteed convergence
and fast convergence, the series/parallel type IIR computing method is now widely adopted in
automatic equalizers, echo cancellers, etc. [40], [41], [70].

However, in application of an IIR type adaptive filter, in some cases, the aim is not only to
obtain a simple estimate of the transfer function but also to realize an approximate system of
the unknown system by means of the estimated transfer function. In such a case, stability of the
estimated transfer function should be guaranteed. This requires that the roots of the
denominator polynomial be inside the unit circle. Usually, no such guarantee can be made for the
estimated transfer function obtained using the aforementioned method. As a method for checking
that the roots of any denominator polynomial A_n(z) are inside the unit circle, there is the
Schur-Cohn method. However, in this method, computations on the order of O(N^2) are required.
This is much larger than the order of the computational load for estimation itself.
Consequently, it usually becomes the bottleneck for real-time processing, so methods to increase
the speed of the aforementioned Schur-Cohn method have been investigated [illegible].
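A stability check of this kind can be sketched with the step-down (backward Levinson)
recursion, which is a Schur-Cohn type test for polynomials with real coefficients and needs on
the order of N^2 operations. This is an illustrative sketch, not necessarily the procedure
intended in the text. It assumes the normalization A(z) = 1 + a_1 z^-1 + ... + a_N z^-N, and the
function name is_stable is hypothetical.

```python
def is_stable(a):
    """True if all roots of A(z) = 1 + a[0] z^-1 + ... + a[N-1] z^-N
    lie strictly inside the unit circle (real coefficients assumed)."""
    coeffs = list(a)
    while coeffs:
        k = coeffs[-1]                 # reflection coefficient of the current order
        if abs(k) >= 1.0:
            return False
        m = len(coeffs) - 1
        # Step down to the polynomial of order m
        coeffs = [(coeffs[i] - k * coeffs[m - 1 - i]) / (1.0 - k * k)
                  for i in range(m)]
    return True


stable_example = is_stable([-1.5, 0.56])    # roots at 0.7 and 0.8
unstable_example = is_stable([-2.5, 1.0])   # roots at 2.0 and 0.5
```

Each pass removes one order and costs a number of operations proportional to the current
order, which gives the quadratic total. In an adaptive filter the test would have to be
repeated whenever the denominator coefficients are updated.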

10.7.2 Extension of series/parallel computing method and various identification methods 
A disadvantage of the aforementioned series/parallel computing method is that a
deviation (bias) takes place in the estimated value of the parameters when interference noise v_n
exists. In order to prevent this bias, the following schemes have been proposed as
system identification methods for determining an unbiased estimate of the parameters: the
extended least squares method (ELS), the generalized least squares method (GLS), recursive
maximum likelihood estimation (RML), and the instrumental variable method (IV) (see Chapter 9)
[37], [68]. Among these methods, for ELS, GLS and RML, a noise generation model is assumed for
estimation in order to represent the properties of the interference noise. These methods and
IV are included in more general methods for estimating linear recursive models, that is, the
recursive prediction error method (RPEM) and pseudo linear regression (PLR) [65].

Here, the following system, in the form of a general linear regression model, is taken as the
unknown system.

Here, v_n is taken as white noise. For such an unknown system, if one defines as follows:

[illegible] (10.112)
[illegible] (10.113)
[illegible] (10.114)
[illegible]

the output prediction y_n and its prediction error using the estimated value θ_n of coefficient
vector θ become the following:

(10.117)

For such a model, the basic algorithm of the recursive prediction error method is as follows:

Also, the PLR method becomes

In Equation (10.111), when [illegible] = 1, the aforementioned computing method becomes the RLS
method. When F(z) = D(z) = 1, the RPEM method becomes the recursive maximum likelihood
estimation method (RML), and the PLR method becomes the ELS method. Also, the GLS method
corresponds to the case of [illegible] = 1, and the HARF and SHARF methods to be described in
the next section correspond to the case of [illegible].
In order to estimate the parameters of the aforementioned complicated models, it is
necessary to include a closed feedback loop in the estimation system, and to solve a general
nonlinear optimization problem. In the aforementioned methods, a local optimum solution is
determined by means of the stochastic Newton-Raphson method. The algorithm is basically a
combination of the recursive least squares method and the bootstrap method. Consequently, its
convergence value also depends on the initial value. In the analysis of convergence of the
aforementioned algorithm, the ordinary differential equation (ODE) method, the Lyapunov
function method, etc. are adopted.

In addition, the instrumental variable method, which actively uses the property that the noise
is not correlated with the input signal, can be used. Because the error surface is a quadratic
function of the parameters, one may simply solve an extended normal equation in the estimation,
and its solution can be uniquely determined. This is a major advantage over the
aforementioned methods. This computing method has the same form as that of the RLS method
when the matrix inversion lemma is used to obtain a recursive solution. This system
is now adopted in several fields of adaptive signal processing [58], and the gradient
instrumental variable method is also adopted in echo canceling [42], [45].

10.7.3 Parallel structure 

The parallel structure is also called the output error method (OEM). It differs from the
series/parallel structure in that an IIR type estimation system is formed, and the estimation
computing method is derived by minimizing the error between the output of the estimation system
and the output y_n of the unknown system:

Figure 10.12 is a diagram illustrating its configuration.

Figure 10.12. Parallel-structure IIR type adaptive filter



[1] Updating computing method by means of gradient method 

When the gradient method is used to minimize the square of the output error with respect
to θ_n, the parameters are updated as follows.

where the components of the gradient vector are defined as follows:

[illegible] (10.123)
[illegible] (10.124)

Usually, if the adaptive updating is sufficiently smooth, the following approximations can be
adopted:

(10.126)

and the following recursive equations of the gradient vector can be obtained:

[illegible] (10.127)
[illegible] (10.128)

However, a time-varying filter whose denominator polynomial is A_n(z^-1) is contained both in
the adaptive filter itself and in the computation of the gradient. Consequently, if it is
unstable, the adaptive filter also diverges. Therefore, in order to guarantee stability, if the
roots of A_n(z^-1) obtained at time n are outside the unit circle, the new estimated value is
discarded, or the roots are pulled back into the unit circle before updating, with the
methods proposed in [38], [39].

On the other hand, because it is easy to judge the stability of a second-order IIR filter, a
scheme to include it in the structure has been proposed. For example, in a scheme proposed in
[39], a parallel structure is formed by connecting second-order IIR filters (biquads) in
parallel, or a cascade structure is formed by connecting them in series.

In addition, a structure using second-order IIR filters having fixed poles for guaranteeing
stability [44] has also been proposed. The positions of the fixed poles are predetermined
according to knowledge obtained in experiments. In addition, a structure using an all-pole
lattice filter has been proposed [39].

[2] Output error method by means of hyperstability theory

The gradient method has the disadvantage that the convergence speed is low. In order to
increase the speed of convergence, adoption of the Newton-Raphson method has been
considered. However, because the error surface of an estimation system containing
feedback is a complicated nonlinear function, it cannot be approximated as a quadratic function.
Consequently, global convergence cannot be guaranteed when the RLS method and other methods
are started from an arbitrary initial point.




Figure 10.13. Structure of HARF computing method (courtesy Larimore A., et al., Copyright 
©1980 IEEE). 

Key: b Delay updating 

In order to solve the aforementioned problem, a hyperstable adaptive recursive filter
(HARF) and a simplified HARF (SHARF) have been proposed as adaptive updating computing
methods that guarantee global convergence (global stability) by using
hyperstability theory, known as a general stability theory for nonlinear time-varying systems
[47], [48]. Figure 10.13 shows the configuration of this scheme.

For the estimation system, the output [illegible] and the instrumental output f_n are as follows:

[illegible] (10.130)

Also, the output error and the smoothed error are defined as follows:

[illegible] (10.131)
[illegible] (10.132)

Here, the following filter is called the smoothing filter:

[illegible] (10.133)

According to the HARF computing method, the coefficients are updated as follows:

[illegible] (10.134)
[illegible]

Here, q_n is taken as the normalization factor, and it is obtained as follows:

When the smoothing filter C(z) meets the following strictly positive real (SPR) condition and an
appropriate initial value of the coefficient is selected, convergence of the HARF computing
method can be guaranteed. That is, when G(z) = C(z)/A(z), the following relationship is
established:

When the initial value is also appropriate, the error quantity

[illegible] (10.138)

converges to 0, and one has [illegible]. Consequently, in practice, the following relationship is
established:

10.8 Adaptive lattice filter [50]-[52] 

Itakura and Saito have shown that the adaptive lattice filter (Figure 10.14) can realize the
Levinson-Durbin computing method as a linear prediction computing method (see Chapter 8)
[66]. This structure has the advantage that the sensitivity to the filter coefficients is low,
and the numerical stability is also high. Also, it is simple to guarantee the stability of the
all-pole structure. Due to these and other advantages, it is now in wide use in various fields
of adaptive signal processing [48].




Figure 10.14. Adaptive lattice filter. 

10.8.1 Adaptive linear prediction and lattice type filter 

First, the pth-order linear prediction problem is formulated. Linear prediction of the
current value using p past values of signal x_n is as follows:

[illegible] (10.140)

This prediction is called forward prediction. On the other hand, the prediction of x_{n-p} using
x_n, x_{n-1}, ..., x_{n-p+1}

[illegible] (10.141)

is called backward prediction. Here, signal vector x, the forward prediction vector and the
backward prediction vector are defined as follows:

[illegible] (10.142)
[illegible] (10.143)

and the forward and backward prediction errors are as follows:

[illegible] (10.145)
[illegible] (10.146)

For the lattice structure, the forward prediction error e_p(n) and the backward prediction
error r_p(n) are defined in the following order-updated form:

[illegible] (10.147)

where [illegible] are the reflection coefficients.
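The order-updated form can be sketched as follows. The sketch assumes the common convention
e_{p+1}(n) = e_p(n) - k_{p+1} r_p(n-1) and r_{p+1}(n) = r_p(n-1) - k_{p+1} e_p(n), with a single
fixed reflection coefficient per stage, e_0(n) = r_0(n) = x_n, and zero initial state. The text
may use separate forward and backward coefficients or a different sign convention. The function
name lattice_errors is hypothetical.

```python
def lattice_errors(x, k):
    """Forward and backward prediction error sequences of order len(k)."""
    e = list(x)                      # e_0(n) = x(n)
    r = list(x)                      # r_0(n) = x(n)
    for kp in k:
        r_delayed = [0.0] + r[:-1]   # r_p(n-1), zero initial state
        e_next = [en - kp * rd for en, rd in zip(e, r_delayed)]
        r_next = [rd - kp * en for en, rd in zip(e, r_delayed)]
        e, r = e_next, r_next
    return e, r


# x(n) = 0.5^n is the impulse response of a first-order all-pole system,
# so one stage with k = 0.5 whitens it completely.
x = [1.0, 0.5, 0.25, 0.125]
e1, r1 = lattice_errors(x, [0.5])
```

Each stage needs only the current forward error and the delayed backward error of the previous
stage, so raising the order adds one stage without changing the earlier ones.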

10.8.2 Adaptive realization of Levinson computing method 

As methods for realizing an adaptive lattice filter, the block computing method and the
recursive computing method exist. For computing the reflection coefficient, a few methods have
been proposed, that is, a method in which the statistical average in the Levinson-Durbin
method is replaced with the time average, and a method in which updating is
performed using a gradient method to obtain a minimum prediction error for each section.

First, the following computing method can be adopted to compute a reflection coefficient
having the minimum sum of squares in the block of forward and backward prediction errors [68].

Itakura and Saito use the geometric mean of the aforementioned two coefficients in
computing a reflection coefficient [69]:

k_p computed in this way is called the PARCOR coefficient, and, because its absolute
value is always smaller than one, the minimum phase property of the filter transfer function
is maintained. In addition, in order to maintain the minimum phase property, another method
uses the minimum values of [illegible].

The reflection coefficient corresponding to the minimum sum of squares in the block of
the forward prediction error and backward prediction error is as follows:

[illegible] (10.152)

and this becomes the harmonic mean of [illegible] [67].
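The relationship between these block estimates can be sketched numerically. The sketch assumes
the order update e_{p+1}(n) = e_p(n) - k r_p(n-1) and r_{p+1}(n) = r_p(n-1) - k e_p(n), so that
the forward-optimal and backward-optimal coefficients are the cross-correlation divided by the
backward and forward error energies, respectively. The names k_f, k_b, k_geo and k_har and the
sample values are hypothetical.

```python
import math


def reflection_estimates(e, r_delayed):
    """Block estimates of one reflection coefficient from e_p(n) and r_p(n-1)."""
    c = sum(a * b for a, b in zip(e, r_delayed))   # cross-correlation
    ee = sum(a * a for a in e)                     # forward error energy
    rr = sum(b * b for b in r_delayed)             # backward error energy
    k_f = c / rr                     # minimizes the forward error energy
    k_b = c / ee                     # minimizes the backward error energy
    k_geo = c / math.sqrt(ee * rr)   # geometric mean of k_f and k_b, with the sign of c
    k_har = 2.0 * c / (ee + rr)      # harmonic mean of k_f and k_b
    return k_f, k_b, k_geo, k_har


x = [1.0, 0.9, 0.5, 0.1, -0.3, -0.6, -0.4, 0.0, 0.3]
e0, r0_delayed = x[1:], x[:-1]       # order 0: e_0(n) = x(n), r_0(n-1) = x(n-1)
k_f, k_b, k_geo, k_har = reflection_estimates(e0, r0_delayed)
```

By the Cauchy-Schwarz inequality the geometric-mean estimate never exceeds one in magnitude,
and the harmonic-mean estimate is never larger in magnitude than the geometric-mean estimate,
whereas k_f or k_b alone can exceed one.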

On the other hand, a gradient recursive computing method has been proposed as a
scheme for real-time updating of the reflection coefficients in (10.149) and (10.150) [69]:

[illegible] (10.153)
[illegible] (10.154)

or, even simpler forms may be used:

[illegible] (10.155)
[illegible] (10.156)

To normalize them, in Equations (10.153) and (10.154), μ is replaced by [illegible] to become:

[illegible] (10.158)

Also, in Equation (10.156), another scheme uses [illegible] instead of μ:

[illegible] (10.159)
(10.160)

In addition, the recursive updating method of reflection coefficient (10.152) is as follows:

(10.161)
(10.162)

and R_p(n) is updated as shown in Equation (10.160) [49].



10.8.3 Least square lattice type filter 

For least squares estimation, it is necessary to solve a normal equation. In this case, the
operation for determining the inverse of the covariance matrix of the signal plays a
central role. Gaussian elimination and other methods require a computational load
of O(N^3). In consideration of this problem, Levinson proposed a fast computing method of
order O(N^2) that uses the property that the covariance matrix of a weakly stationary
process is a Toeplitz matrix. This is realized by means of lattice type filters (least squares
lattice filters (LSL)). In addition, in adaptive processing, the shift-low-rank property of the
sample covariance matrix was exploited by Morf in presenting a fast computing method of
order O(N). This method has since become the basis of the fast least squares computing
methods. The transversal type fast RLS computing method now available is numerically unstable.
On the other hand, an LSL computing method based on the same scheme, yet having stable time
updating and the same order of computational load as the gradient method [55], has
been proposed.

First, assuming that [illegible] are the optimum prediction vectors for realizing the
least squares estimation, the following normal equations are established.

(10.163)
(10.164)

Here, [illegible] represent the least squares errors of the pth-order forward and backward
prediction.

In the following, an explanation will be given regarding a method of determining the
solution of the (p + 1)th-order linear prediction at the same time n when the solution of the
pth-order linear prediction at time n is known (order updating), and a method of determining the
pth-order prediction at time n + 1 when the pth-order prediction at time n is known (time
updating).



(1) Order updating 

The order updating of the prediction vectors [illegible] is performed as follows.

(10.165)
(10.166)

where

(10.168)

β_p(n) is the time cross-correlation function of the forward and backward prediction errors.
Here, one has

[illegible] (10.170)

Consequently, the lattice computing method for order updating of the forward and
backward prediction errors is defined as follows:

[illegible] (10.171)
[illegible] (10.172)



(2) Time updating 

First, the likelihood variable γ_p(n) is defined as follows:

[illegible] (10.173)

It has the order updated as follows:

[illegible] (10.174)

Each time a new signal vector is input, the various variables of the aforementioned
lattice computing method are time updated as follows:

[illegible] (10.175)
[illegible] (10.176)
[illegible] (10.177)

As can be seen from a comparison between the gradient lattice computing method and the
LSL, although both computing methods compute a cross-correlation function of the forward
and backward prediction errors, they differ as follows: while in the gradient method the
weighting of the new signal is uniform at all times, in the LSL method a weighting coefficient
1/(1 - γ_p(n)) containing γ_p(n) is used. Because one has 0 ≤ γ_p(n) ≤ 1, the more new
components the autocorrelation information of the new signal vector contains as compared with
the covariance matrix of the signals that have been accumulated, the nearer γ_p(n) is to unity.
Consequently, when a signal having new information is input, its weight becomes very large, so
fast convergence is possible.




Figure 10.15. Normalized LSL adaptive filter 



(3) Normalized LSL computing method 

When the aforementioned LSL computing method is normalized, it is clear that
computation of square roots is introduced. As a result, a concise computing method can be
obtained for updating the three parameters (the normalized reflection coefficient and the
normalized forward and backward prediction errors) (Figure 10.15).

(10.178)

(10.179)

(10.180)

(4) Joint estimation by LSL 

The lattice adaptive filter is a computing method developed for linear prediction of a
signal. However, for adaptive signal processing problems that can be formulated as
FIR type system identification, the past values of input signal x_n are used to predict the
desired signal y_n. That is, joint estimation is necessary.




Figure 10.16. Joint estimation using lattice type ADF. 



In this case, when the aforementioned LSL computing method is adopted, the least squares
joint estimation method can be realized easily. Its configuration is shown in Figure 10.16. More
specifically, instead of using the past values of input signal x_n, the mutually orthogonal
backward prediction errors [illegible] of the LSL are used, and y_n is predicted by means of a
linear combination of them. Assuming that the prediction error is as follows:

[illegible] (10.181)

weighting coefficient c_p(n) is updated as follows.

[illegible] (10.182)

(10.183)

10.9 Constitutions of other adaptive filters

10.9.1 Constitution with whitening of the input signal 

In order to overcome the problem of the gradient method in that it has a low convergence
speed with respect to a colored input signal, a method has been proposed in which the input
signal is first whitened before input to the adaptive filter. As a scheme for
whitening, Gram-Schmidt orthogonalization is often adopted. Also, there is a scheme in
which the adaptive lattice filter is used to output a weighted linear combination of the
orthogonal backward prediction errors {r_p(n)} [57]. Here, the weighting coefficients are updated
using the conventional LMS method.

10.9.2 Fast computing method of least square method and systolic array 

Recently, in order to further increase the operating efficiency, a fast RLS computing
method has been proposed, that is, FTF (fast RLS transversal filter) [58]. However, it has been
pointed out that fast computing methods other than the lattice type (LSL) are numerically
unstable, and that even the LSL should not be used for high orders [62]. On the other
hand, research has been performed on schemes in which plural processors are used and
parallel processing and pipeline processing are combined to increase the overall throughput, as
in the systolic array method or the like [51].

10.9.3 ARMA lattice filter and identification method by means of embedding 

The aforementioned lattice filter has been adopted in prediction of the AR process (see
Chapter 8, Section 8.5). Also, a lattice structure that can predict the ARMA process has been
studied (see Chapter 8, Section 8.8) [52]. More specifically, a scheme has been adopted for
predicting the 2-channel signal obtained by embedding the input/output signals.

In particular, according to the ARMA Levinson computing method, prediction of a multi-channel
signal can be obtained by using a recirculation lattice filter [[illegible]].

Also, when the problem of identification of an IIR system is converted to the problem of
prediction of a 2-channel signal, the fast computing method of multi-channel least squares
estimation and a lattice filter can be used [59], [60], [70] (in consideration of the causal
relationship of the modeling, [illegible] or another delayed embedding must be used [60], [61]).
In addition, by modifying the multi-channel maximum entropy computing method, it is possible
to obtain a minimum phase estimate of the denominator polynomial of the transfer function, and
a pseudo unknown system with stability guaranteed automatically can be determined [62].
In particular, the LATIN (lattice-inverse) structure, formed by sequentially
connecting a lattice filter and its inverse filter, is effective in applications such as IIR
type echo and howling cancellation [63], [64].

10.10 Adaptive array

An adaptive filter usually processes signals as time series. An adaptive array, by
contrast, processes signals as space-time series, using their spatial characteristics
together with their temporal characteristics. For example, when two signals with different spatial
characteristics are highly correlated as time series and their frequency
bands overlap, it is hard to separate them with an adaptive filter. However, an
adaptive array that exploits the difference in position of the two signal sources can
separate them. Consequently, this technology is now widely adopted in
various fields, such as radar, sonar, wireless communication, and acoustic processing, as well as
geological surveying and astronomical measurement.

As shown in Figure 10.17, a typical linear adaptive array comprises an
array of sensors connected to conventional complex-coefficient adaptive filters. For
narrow band signals, usually only a single weight is connected to each sensor. In the
following, however, the general structure for wide band signals (wide band array) is assumed,
in which an FIR type adaptive filter is connected to each sensor.




Figure 10.17. Structure of a wide band adaptive array (courtesy Widrow, B. et al., © 1967 IEEE).

Key: f Sensor 
g Output 

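Assuming that the received signal of the $i$th sensor is $x_i(n)$, the output signal of the array
becomes:

$$y(n) = \sum_{i=1}^{M} \sum_{l=1}^{L} w_{il}^{*}\, x_i(n-l+1) \qquad (10.184)$$

where $M$ is the number of sensors and $L$ is the number of taps of each FIR filter.
Here, the coefficient vector and signal vector are defined as follows:

$$w = (w_{11}, \ldots, w_{1L}, \ldots, w_{M1}, \ldots, w_{ML})^{T} \qquad (10.185)$$

$$x(n) = (x_1(n), \ldots, x_1(n-L+1), \ldots, x_M(n), \ldots, x_M(n-L+1))^{T} \qquad (10.186)$$

As a result, one has

$$y(n) = w^{H} x(n) \qquad (10.187)$$

Here, $w^{H}$ is the complex conjugate transpose of $w$.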

In the following, the steering vector that represents the intrinsic characteristics of the
array will be defined. For a complex plane wave incident at an arrival angle $\theta$ (the
angle with respect to the normal direction of the array) and at a frequency $\omega$, the steering
vector becomes

$$d(\theta, \omega) = (e^{-j\omega\tau_1(\theta)}, e^{-j\omega\tau_2(\theta)}, \ldots)^{T} \qquad (10.188)$$

where $\tau_i(\theta)$ represents the delay in the signal from the first tap to the $i$th tap.

The directional response of the array is represented by the inner product of the coefficient
vector and the steering vector.

$$r(\theta, \omega) = w^{H} d(\theta, \omega) \qquad (10.189)$$

Consequently, by appropriate adjustment of the relative direction of the coefficient vector $w$
and the steering vector, it is possible to suppress or extract an input signal in a prescribed
direction and at a prescribed frequency.
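
The inner product of the coefficient vector and the steering vector can be illustrated with a minimal sketch. It assumes a narrow band uniform linear array with one weight per sensor; the geometry, the sign convention of the phase term, and the names `steering_vector` and `array_response` are illustrative assumptions, not taken from the text.

```python
import cmath
import math

def steering_vector(theta, omega, num_sensors, spacing, speed):
    """Steering vector of a uniform linear array with one tap per sensor.

    Assumption: sensor i is delayed by tau_i = i * spacing * sin(theta) / speed
    relative to sensor 0, and phases are written as exp(-j * omega * tau_i).
    """
    return [cmath.exp(-1j * omega * i * spacing * math.sin(theta) / speed)
            for i in range(num_sensors)]

def array_response(weights, steering):
    """Inner product w^H d: conjugate the weights, multiply, and sum."""
    return sum(w.conjugate() * d for w, d in zip(weights, steering))

# Hypothetical set-up: 4 sensors at half-wavelength spacing, 1 kHz tone in air.
SPEED = 340.0                      # propagation speed in m/s (assumed)
OMEGA = 2.0 * math.pi * 1000.0     # angular frequency in rad/s
SPACING = SPEED / 1000.0 / 2.0     # half a wavelength in metres
NUM_SENSORS = 4

look = steering_vector(math.radians(30.0), OMEGA, NUM_SENSORS, SPACING, SPEED)
weights = [d / NUM_SENSORS for d in look]   # weights aligned with 30 degrees

gain_look = abs(array_response(weights, look))
gain_broadside = abs(array_response(
    weights, steering_vector(0.0, OMEGA, NUM_SENSORS, SPACING, SPEED)))
print(f"gain at 30 deg: {gain_look:.3f}, gain at 0 deg: {gain_broadside:.3f}")
```

With these assumed values the weights extract the 30 degree direction with unit gain, while a wave from the array normal falls on a null of the pattern, which is the suppress-or-extract behavior described above.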

In the following, an explanation will be given regarding several computing methods
by which the coefficient vector $w$ is determined adaptively. Just like conventional adaptive
computing methods, they can be classified into a block type and a recursive type.

10.10.1 Side lobe canceller 

A side lobe canceller comprises a principal channel and a secondary channel [71], [72].
The space and frequency characteristics of the principal channel are designed such that the
desired signal can be extracted, and an unknown interference signal contained in its output is
removed by adaptively adjusting the weighting of the secondary channel. Assuming that the
output of the principal channel is $y_m(n)$ and the input of the secondary channel is $x_a(n)$,
the secondary array coefficient $w_a$ is defined such that the
following average power of the output $y(n)$ of the entire array reaches a minimum:

$$E[|y(n)|^{2}] = E[|y_m(n) - w_a^{H} x_a(n)|^{2}] \qquad (10.190)$$

Just as in the derivation of the FIR type adaptive filter, $w_a$ can be determined by
solving the normal equations:

$$R_a w_a = r_a \qquad (10.191)$$

$$R_a = E[x_a x_a^{H}], \quad r_a = E[y_m^{*}(n)\, x_a] \qquad (10.192)$$
As a more specific method of realization, in the case of the block type computing method, the
ensemble average is replaced with the time average over the block. In the case of the recursive
computing method, one may adopt the LMS method or the RLS method.

If the desired signal is relatively weak, it is possible to improve the SN ratio using this 
scheme. However, when the desired signal is more intense than the interference signal, the 
desired signal itself might be eliminated. Usually, the following method is adopted: if a desired 
signal does not exist, the coefficient of the secondary channel is trained by means of adaptive 
updating, and, if a desired signal exists, the adaptive updating is stopped. 
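
The train-then-freeze procedure can be sketched as follows. This is a simplified, real-valued illustration with a single secondary weight updated by LMS; the function name `sidelobe_canceller`, the leakage factor, and the test signals are hypothetical.

```python
import math

def sidelobe_canceller(main, aux, adapt, mu=0.05):
    """Single-weight, real-valued side lobe canceller trained by LMS.

    main  : principal channel output y_m(n)
    aux   : secondary channel input x_a(n)
    adapt : booleans; the weight is updated only where adapt[n] is True
    """
    w = 0.0
    output = []
    for y_m, x_a, train in zip(main, aux, adapt):
        y = y_m - w * x_a          # output of the entire array
        if train:
            w += mu * y * x_a      # LMS step that lowers the output power
        output.append(y)
    return output, w

# Hypothetical scenario: interference leaks into the principal channel with
# gain 0.8; the desired signal is absent during the first 2000 samples.
N_TRAIN, N_TOTAL, LEAK = 2000, 3000, 0.8
interference = [math.sin(0.3 * n) for n in range(N_TOTAL)]
desired = [0.0 if n < N_TRAIN else 0.5 * math.cos(0.05 * n)
           for n in range(N_TOTAL)]
main = [s + LEAK * i for s, i in zip(desired, interference)]
adapt = [n < N_TRAIN for n in range(N_TOTAL)]   # freeze once the signal appears

output, weight = sidelobe_canceller(main, interference, adapt)
worst = max(abs(y - s) for y, s in zip(output[N_TRAIN:], desired[N_TRAIN:]))
print(f"weight = {weight:.4f}, worst residual interference = {worst:.2e}")
```

Because updating stops before the desired signal appears, the trained weight removes the interference without attenuating the desired signal itself.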

10.10.2 Method using reference signal [[illegible]] 

In practice, when information pertaining to the desired signal $y_d(n)$ is available, it is possible
to generate a reference signal $y_r(n)$ similar to the desired signal. In this case, the coefficient of
the secondary array of the side lobe canceller can be defined such that the following square error
between the output and the reference signal reaches a minimum:

That is, 

When this computing is realized adaptively, just as described in the preceding section, 
one may adopt the standard LMS method or RLS method. 

To generate the reference signal, one needs only knowledge of the cross-correlation
between the desired signal $y_d(n)$ and the input signal. That is, when the following
relationship is met:

$$r_{xr} = r_{xd} \qquad (10.197)$$

where $r_{xr}$ and $r_{xd}$ denote the cross-correlation vectors of the input signal with the
reference signal and with the desired signal, respectively,
the optimum coefficient can be determined using the aforementioned computing method.
Consequently, information about the arrival direction of the desired signal is usually not
required.

10.10.3 Linear constrained minimum variance array [73] 

Usually, for an input signal in a prescribed direction and at a prescribed frequency, it is
convenient to assign the response of the array in advance. This gives the constraining condition
when the array coefficients are determined. The array coefficients are determined such that the
average power of the output reaches a minimum under the constraining condition. For example,
for an input signal with arrival angle $\theta_0$ and at frequency $\omega_0$, assuming that the steering vector of
the array is $d(\theta_0, \omega_0) = c$ and the response of the array is to be set at $c^{H} w = g$, one may
minimize the next evaluation function by means of the Lagrange multiplier method.

$$L = w^{H} R_x w + \lambda\, (c^{H} w - g)$$

where $R_x = E[x(n) x^{H}(n)]$ is the correlation matrix of the input signal.
The obtained array coefficient becomes the following:

$$w = \frac{R_x^{-1} c}{c^{H} R_x^{-1} c}\, g$$

In addition, general constraining conditions, such as assignment of zero points and
fixed gains with respect to plural signals and control of the bandwidth of the assigned gain using
constraining conditions on derivatives, are represented by simultaneous linear
equations, and are given by the following constraining matrix $C$ and gain vector $g$.

$$C^{H} w = g \qquad (10.200)$$

Under such a constraining condition, the evaluation function

$$L_c = w^{H} R_x w + \lambda^{H} (C^{H} w - g) \qquad (10.201)$$

is minimized, and the obtained array coefficient becomes

$$w = R_x^{-1} C\, (C^{H} R_x^{-1} C)^{-1} g \qquad (10.202)$$

The aforementioned optimum coefficient can be realized by means of a block type
adaptive computing method. Also, it is possible to realize the recursive form by means of a
gradient method for minimizing $L_c$. For example, under the single constraining condition
$c^{H} w = g$, updating of the coefficient is performed as follows [75]:

$$w_{n+1} = P\,[w_n - \mu R_x w_n] + c\,(c^{H} c)^{-1} g$$

Here, $\mu$ is the step size and $P = I - c\,(c^{H} c)^{-1} c^{H}$ is the orthogonal projection
operator onto the orthogonal complement of $c$.

Consequently, the coefficient is updated along the orthogonal projection, onto the region
that meets the constraining condition, of the gradient $R_x w_n$ of $w_n^{H} R_x w_n$ with respect
to $w_n$. Here, $R_x$ is approximated by $x(n) x^{H}(n)$ and, by using $y(n) = w_n^{H} x(n)$, the
following updating computing method is obtained [73]:

$$w_{n+1} = P\,[w_n - \mu\, y^{*}(n)\, x(n)] + c\,(c^{H} c)^{-1} g$$

Also, with plural constraining conditions, the adaptive computing method may be adopted in the
same way.
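
The projected gradient update under a single constraint can be sketched as below. This is a real-valued illustration (so that conjugates drop out) written under the assumption that the update has the usual constrained LMS form: take an LMS step, project it onto the orthogonal complement of c, and restore the prescribed gain. The names `frost_lms`, `jam`, and all numerical values are hypothetical.

```python
import math

def dot(a, b):
    """Real inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def frost_lms(snapshots, c, g, mu):
    """Constrained LMS for a single real-valued constraint c^T w = g."""
    cc = dot(c, c)
    f = [ci * g / cc for ci in c]        # minimum-norm vector meeting the constraint
    w = list(f)                          # start on the constraint plane
    for x in snapshots:
        y = dot(w, x)                    # array output for this snapshot
        u = [wi - mu * y * xi for wi, xi in zip(w, x)]   # unconstrained LMS step
        k = dot(c, u) / cc
        # Project u onto the orthogonal complement of c, then restore the gain g.
        w = [ui - k * ci + fi for ui, ci, fi in zip(u, c, f)]
    return w

# Hypothetical two-sensor case: unit gain toward c, interference along `jam`.
c, g = [1.0, 1.0], 1.0
jam = [1.0, -0.5]
snapshots = [[math.sin(0.3 * n) * j for j in jam] for n in range(3000)]

w = frost_lms(snapshots, c, g, mu=0.05)
print(f"look-direction gain = {dot(c, w):.6f}, jammer gain = {dot(jam, w):.2e}")
```

In this sketch only interference is present in the snapshots, so the weights settle exactly on the point of the constraint plane that nulls the interference; with a desired signal present the weights would fluctuate around that point by an amount that depends on the step size.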

When the linear constrained minimum variance (LCMV) array is formulated as follows,
a generalized side lobe canceller (GSC) is obtained as a natural extension of the side lobe
canceller [74].

First, when the coefficient space is orthogonally decomposed into the range space and the
null space of the constraint matrix $C$, the coefficient vector $w$ becomes the sum of a component
$w_c$ in the range space of $C$ and a component $-v$ in the null space of $C$.

$$w = w_c - v$$

$w_c$ is uniquely defined by the constraint condition:

$$w_c = C\, (C^{H} C)^{-1} g \qquad (10.207)$$

Also, assuming that $D$ is the matrix whose column vectors are basis vectors of the null space
of $C$ (that is, $C^{H} D = 0$), $v$ is represented as $v = D w_D$. Consequently, the estimation
problem becomes the problem of determining $w_D$, which reduces to an unconstrained minimization
problem. That is, $w_D$ is determined so that the following reaches a minimum:

$$E[|y(n)|^{2}] = E[|(w_c - D w_D)^{H} x(n)|^{2}]$$

The solution obtained using the block computing method is as follows:

$$w_D = (D^{H} R_x D)^{-1} D^{H} R_x w_c$$

Also, if the recursive computing method is adopted, the following gradient computing method is
used:

$$w_{D,n+1} = w_{D,n} + \mu\, D^{H} x(n)\, y^{*}(n) \qquad (10.210)$$
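
A minimal real-valued sketch of the generalized side lobe canceller follows. It assumes the convention that the output is formed as the fixed constrained branch minus the adaptive branch, so that the unconstrained LMS step on the adaptive weight carries a plus sign; with the opposite convention the sign of the step flips. The blocking matrix has a single column here, and the names `gsc_lms`, `d_col`, and `jam` are hypothetical.

```python
import math

def gsc_lms(snapshots, w_c, d_col, mu):
    """Real-valued GSC with a one-column blocking matrix D = d_col.

    Output y = (w_c - D w_D)^T x; only the scalar w_D is adapted.
    """
    w_d = 0.0
    for x in snapshots:
        fixed = sum(wi * xi for wi, xi in zip(w_c, x))      # constrained branch
        blocked = sum(di * xi for di, xi in zip(d_col, x))  # D^T x
        y = fixed - w_d * blocked
        w_d += mu * blocked * y        # unconstrained LMS on the adaptive branch
    return w_d

# Hypothetical two-sensor case with the single constraint c^T w = g.
c, g = [1.0, 1.0], 1.0
cc = sum(ci * ci for ci in c)
w_c = [ci * g / cc for ci in c]        # C (C^T C)^{-1} g for one constraint
d_col = [1.0, -1.0]                    # spans the null space: c^T d_col = 0
jam = [1.0, -0.5]
snapshots = [[math.sin(0.3 * n) * j for j in jam] for n in range(3000)]

w_d = gsc_lms(snapshots, w_c, d_col, mu=0.05)
w = [wc - w_d * d for wc, d in zip(w_c, d_col)]
look_gain = sum(ci * wi for ci, wi in zip(c, w))
jam_gain = sum(ji * wi for ji, wi in zip(jam, w))
print(f"w_D = {w_d:.4f}, look gain = {look_gain:.4f}, jammer gain = {jam_gain:.2e}")
```

The constraint is satisfied by construction for any value of the adaptive weight, which is why the adaptive branch can be trained with an ordinary unconstrained algorithm.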

(J. Chao) 

10.11 Cancellation of noise 

In the following, an explanation will be given regarding an adaptive echo canceller and
an adaptive noise canceller as examples of application of adaptive signal processing. The basic
idea of echo cancellation and noise elimination is that, instead of estimating the waveform
itself, the transfer function (impulse response) of a certain linear system present in the path of
the waveform is estimated. In other words, by estimating the parameters
of the unknown system, it is possible to eliminate an unwanted waveform. Also, in practical
application, it is important to adjust the step gain, etc., of the adaptive
algorithm according to the state of the parameter estimation (see the concept of an intelligent
adaptive filter [76], [77]), and to add a function for coping quickly with noise, variation in an
unknown parameter, or an increase in bias.

10.11.1 Adaptive echo canceller (see Book 4, Chapter 5, Section 5.3)

When a long distance telephone line, such as one for international calls, is used, after a 
few seconds, the voice of the sender will be heard by the sender himself/herself from his/her own 
receiver, so conversation becomes difficult. The delayed voice of the caller himself/herself is 
known as echo. The cause for generation of echo is mismatch in impedance of a hybrid circuit 
set at the connecting portion of a 2-wire line and a 4-wire line. In conventional technology,
silence of the counterpart is detected, and a switch known as a voice switch is used to lower the
gain of the channel over which the voice of the counterpart is sent, so the echo is suppressed.
However, this system cannot suppress echo when the counterpart speaks
at the same time as the caller (known as double talk), and switching of the voice switch leads to an
unnatural sensation. Due to these problems, there is room for further improvement.

With this background, a system has been proposed with the following features: the
parameter estimation concept is used for adaptive estimation of the echo path, and a pseudo echo is
generated and subtracted from the original echo so that the echo is suppressed.
This system is called an adaptive echo canceller. Figure 10.18 is a diagram schematically
illustrating an echo canceller. As shown in Figure 10.18(b), if the transfer function from
point A to point B (mainly the impulse response) can be estimated, the pseudo echo can be
obtained.
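
The idea of estimating the echo path and subtracting a pseudo echo can be sketched with the normalized LMS (learning identification) update. This is a noise-free illustration with an assumed three-tap echo path; the names `nlms_echo_canceller` and `fir`, the path coefficients, and the step size are hypothetical.

```python
import random

def nlms_echo_canceller(far_end, mic, num_taps, mu=0.5, eps=1e-8):
    """Estimate the echo path with NLMS and subtract the pseudo echo."""
    w = [0.0] * num_taps               # replica of the impulse response
    buf = [0.0] * num_taps             # newest far-end sample first
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        pseudo_echo = sum(wi * bi for wi, bi in zip(w, buf))
        e = d - pseudo_echo            # residual after cancellation
        norm = eps + sum(bi * bi for bi in buf)
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        residual.append(e)
    return w, residual

def fir(h, x):
    """Convolve x with impulse response h (output truncated to len(x))."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
echo_path = [0.5, -0.3, 0.1]           # hypothetical path from point A to point B
mic = fir(echo_path, far_end)          # echo only: no near-end speech or noise

w, residual = nlms_echo_canceller(far_end, mic, num_taps=4)
print("estimated path:", [round(v, 4) for v in w])
```

With near-end speech present (double talk) the residual no longer consists of echo alone, so practical systems slow or stop the update in that state; the sketch omits this control.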

Figure 10.18. Schematic diagram of adaptive echo canceller.

10.11.2 Adaptive noise canceller

Consider the case when the desired signal is buried in noise. Signal processing
technology for minimizing the influence of noise on a desired signal has an extremely wide
application range, and it is a long-standing topic that remains an active research theme. Here,
if the noise alone can be picked up separately, this topic can be addressed efficiently by means
of the concept of an adaptive noise canceller. Figure 10.19 is a schematic diagram illustrating
an adaptive noise canceller.

Figure 10.19. Schematic diagram illustrating an adaptive noise canceller.



Key: a 
Point B

Here, point A and point B are known as the principal input terminal and reference input terminal,
respectively. The principal input terminal is a conventional input terminal, where the waveform
consisting of the sum of the desired signal s(k) and noise d(k) is input. When the desired signal
s(k) is not mixed into the reference input terminal, by subtracting the waveform at the reference
input terminal, after appropriate linear processing, from the waveform at the principal
input terminal, it is possible to retrieve only the desired signal. In this example, when
the transfer function H(z) of the adaptive filter is equal to the path transfer function K(z),
the output of the adaptive filter is equal to the noise waveform at the principal input terminal, and
it is possible to completely cancel the noise by subtraction.
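
The two-input arrangement can be sketched with a plain LMS filter that adapts H(z) toward the path K(z). The path coefficients, the signals, and the name `lms_noise_canceller` are hypothetical, and the desired signal is assumed to be absent from the reference input.

```python
import math
import random

def lms_noise_canceller(primary, reference, num_taps, mu):
    """LMS adaptive noise canceller; returns (weights, cleaned output)."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps                 # newest reference sample first
    output = []
    for d, r in zip(primary, reference):
        buf = [r] + buf[:-1]
        noise_estimate = sum(wi * bi for wi, bi in zip(w, buf))
        e = d - noise_estimate             # error = estimate of the desired signal
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]
        output.append(e)
    return w, output

rng = random.Random(1)
N = 6000
reference = [rng.uniform(-1.0, 1.0) for _ in range(N)]     # reference input
path = [0.6, -0.2]                                         # hypothetical K(z)
signal = [0.3 * math.sin(0.1 * n) for n in range(N)]       # desired signal s(k)
noise = [path[0] * reference[n] + (path[1] * reference[n - 1] if n > 0 else 0.0)
         for n in range(N)]                                # noise d(k)
primary = [s + v for s, v in zip(signal, noise)]           # principal input

weights, output = lms_noise_canceller(primary, reference, num_taps=2, mu=0.01)
tail = range(N - 1000, N)
rms_error = math.sqrt(sum((output[n] - signal[n]) ** 2 for n in tail) / 1000)
rms_noise = math.sqrt(sum(noise[n] ** 2 for n in tail) / 1000)
print(f"noise rms before: {rms_noise:.3f}, residual rms after: {rms_error:.3f}")
```

Because the desired signal acts as a disturbance on the weight update, cancellation with a fixed step size is close to but not exactly complete; a smaller step size reduces the residual at the cost of slower convergence.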

(Hajime Kubota) 

Chapter 5. Digital communication systems

... is required, and the processing is complicated. For the transmission line code, in order to 
reduce the influence of near-end crosstalk, it is necessary to attempt to reduce the transmission 
rate. In the U.S.A., the 2B1Q (2 binary 1 quaternary) code of the 4-value code (transmission rate 
of 80 kbaud) has been standardized. 

For a subscriber line transmission system, without international unification, both the time 
division direction control transmission system proposed by Japan in the appendix of CCITT 
recommendations and the echo canceller transmission system proposed by North America and 
Europe will be described. 

[2] Structure and realization 

DSU comprises terminal circuits of waveform equalization and timing extraction, etc., 
and interface circuits with terminals. Among the structure circuits, the functions and operations 
of an automatic equalizer and echo canceller in which digital signal processing technology is 
adopted will be explained. 

For a subscriber line of metallic cable, a loss occurs that increases in proportion to the
square root of the frequency. Consequently, a √f equalizer should be present for compensation.
The compensation quantity is defined according to the maximum line loss. For example, for a
subscriber line in Japan, a line loss of 50 dB (at 160 kHz) has been set, which covers 99% of the
subscriber line distribution. Also, for a subscriber line, in consideration of the
convenience of wiring when there is a demand for telephone service, a branching line known as
a bridge tap is provided. However, because the end of the bridge tap is open, reflection leads to
distortion in the waveform. In order to compensate for this, a bridge tap equalizer is needed, and
a decision feedback type equalizer is used. The aforementioned automatic equalizer is needed in
both a time division direction control transmission system and an echo canceller transmission
system. Usually, the √f equalizer is realized with an analog circuit in the stage preceding the
A/D converter.

In an echo canceller transmission system, in order to reproduce data from a received
signal that has been attenuated by 40 dB or more, an echo canceller having an echo canceling
quantity of 60 dB or higher is needed. The echo canceller is usually of the FIR type (about 30
taps). When 2B1Q containing a DC component is used as the transmission line code, in order to
suppress the long tail of the echo generated due to the influence of the DC cutoff of the transformer
of the 2-wire/4-wire conversion hybrid, a combination of an FIR type and an IIR type (first to
second order) is adopted. In addition, when the echo path has nonlinear characteristics, it is
necessary to suppress nonlinear echo. However, it is impossible to suppress nonlinear echo with an
FIR type based on conventional linear operation. Consequently, a RAM table type is usually
used at the same time. According to the RAM table system, the transmission pulse pattern is
taken as the address, and the echo level is accumulated so that a pseudo echo is generated. In this
system, it is possible to suppress nonlinear echo. However, in order to cut the necessary RAM
capacity, a scheme for dividing the RAM is needed [[illegible]].

In order to reduce the size and to lower the power consumption of the DSU, it is
necessary to integrate the digital circuit and analog circuit, including the automatic
equalizer and the echo canceller, into an LSI.

(S. Yamasaki) 

5.3 Echo canceller 
5.3.1 Generation of echo 

One's voice goes through the line on the counterpart side and then returns to the caller 
himself/herself, and it is heard by the caller as an "echo". This phenomenon is known as caller's 
echo (hereinafter to be referred to as echo). This echo occurs and hampers conversation in an
international call or another call with a significant transmission delay, and in a voice-amplified
conversation using a speaker and microphone (hands-free call). Also, when echo is generated at the
two ends of a line, a closed loop is formed via the communication network, and, when the loop
gain is greater than one, oscillation (howling) takes place, and communication becomes
impossible.

Figure 5.14 is a diagram illustrating the constitution of an analog telephone network 
circuit. For a subscriber line that connects a subscriber and the local exchange, in consideration 
of economy, a 2-wire line having the sending signal and the receiving signal superposed on the 
same pair of wires of the line is adopted. On the other hand, for a long-distance line connecting 
exchange facilities of different cities, in order to compensate for transmission loss and to use 
plural lines at a higher efficiency, two pairs of transmission lines are respectively used for the 
sender side and receiver side to form a 4-wire line system. A 2-wire/4-wire converter that 
connects said 2-wire line and 4-wire line is a hybrid transformer. 

Figure 5.14 Constitution of telephone line.



Key: 1 Subscriber line (2 -wire type line) 
Local relay line (4-wire line) 
Subscriber line (2-wire line) 

2 Sending signal 
Receiving signal 

3 Echo 

4 Receiving signal 
Sending signal 

5 Subscriber 
Telephone station 
Telephone station 
Subscriber 

6 Hybrid transformer 



The aforementioned hybrid transformer is designed for impedance matching between the
2-wire line and the balancing network, so that the receiving signal on the 4-wire line side does
not leak into the sending signal on the 4-wire line side. However, different subscribers on
the 2-wire side have different types of lines and different line lengths, so impedance mismatch
with the balancing network takes place, and thus a portion of the receiving signal on the 4-wire line
side flows into the sending signal on the 4-wire line side, leading to generation of echo.

Echo canceling quantity and echo canceling time 

The echo canceling quantity (ERLE) is a quantity indicating the attenuation of the echo,
and it is defined as the ratio of the residual signal power to the echo signal power (the
residual signal and the echo signal are those shown in Figure 5.18). Also, the echo canceling
time is the time until a receiving signal is again input as a sending signal via the echo path.
Figure 5.15 is a diagram illustrating an example of measurement of the one-cycle delay time and
the necessary echo canceling quantity [23]. Here, the one-cycle delay time refers to the time for
the sending signal of the caller to reach the callee and then return to the caller as an echo.
As explained above with respect to the definition of echo, a one-cycle delay corresponds to the
echo, so the desired echo canceling time is proportional to the overall one-cycle delay time of
the entire communication system. The echo canceling time required for a satellite line or an
international line is set at double the longest domestic transmission delay time, in
consideration of canceling the echo arising within a particular country. For example, Tokyo and
the Ogasawara Islands are connected via a satellite communication line, and the echo canceling
time is 60 ms. The echo canceling quantity is determined as follows. The one-cycle delay time for
an international line with a satellite line is about 250 ms and that with a US-Japan marine cable
line is about 150 ms. Because the loss of a conventional domestic transmission system alone
is insufficient, the echo canceling quantity is defined as 30 dB according to CCITT
Recommendation G.165. Also, along with the introduction of the domestic relay system of
digital optical fiber transmission lines in recent years, due to an increase in the transmission
delay and a decrease in the transmission loss, an echo canceller has been introduced to the lines
with a one-cycle delay time of about 60 ms or longer.
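
The echo canceling quantity can be computed from sampled waveforms as sketched below. The sketch assumes the common convention of quoting the attenuation as a positive figure in decibels, that is, 10 log10 of echo power over residual power; the function names and the test waveforms are hypothetical.

```python
import math

def mean_power(samples):
    """Average power of a sampled waveform."""
    return sum(v * v for v in samples) / len(samples)

def erle_db(echo, residual):
    """Echo attenuation in dB: 10 * log10(echo power / residual power)."""
    return 10.0 * math.log10(mean_power(echo) / mean_power(residual))

REQUIRED_DB = 30.0   # figure quoted in the text for CCITT Recommendation G.165

echo = [math.sin(0.2 * n) for n in range(1000)]
residual = [0.01 * v for v in echo]   # hypothetical canceller leaving 1% amplitude
measured = erle_db(echo, residual)
print(f"ERLE = {measured:.1f} dB, meets requirement: {measured >= REQUIRED_DB}")
```

A residual of 1% in amplitude corresponds to a power ratio of 10^-4, so this example yields 40 dB.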

Figure 5.15 Required echo canceling quantity in line echo (telephone system).

Key: 1 Required echo canceling quantity

Figure 5.16 Structure of voice-amplified communication system.

Figure 5.17 Required echo canceling quantity in acoustic echo (voice-amplified system).



Key: 1 Required echo canceling quantity 

2 One-cycle delay time 

3 In-room reverberation time 

4 In-room noise level 



Figure 5.16 shows the structure of a voice-amplified communication system using
microphones and speakers. The voice of the caller is reproduced by the speaker on the caller side
as echo, with a certain delay time, after being picked up by a microphone. Figure 5.17 shows the
results of measurement obtained for the required echo canceling quantity (echo tolerable limit)
versus the one-cycle delay time [24]. It differs from the case of line echo in that, due to the
influence of the acoustic echo path characteristics, echo should be
cancelled even if the one-cycle delay time is only some tens of ms. Also, because the echo
canceling time is the impulse response
length including the absolute delay time from input to the speaker to output from the
microphone, the echo canceling time is longer in a room with a longer reverberation time.






5.3.2 Structure and operation of echo canceller

The echo canceller operates as follows: the transmission characteristics of the echo path
are estimated to form a replica of the impulse response, which is then convolved with the
receiving signal to form a pseudo echo signal, and the pseudo echo signal is subtracted from the
true echo signal to cancel the echo. Because the transmission characteristics of the echo path
vary over time, an adaptive type of pseudo echo circuit that continually updates the replica of
the impulse response is used, and real time operation, high speed, and high precision
are required of the adaptive algorithm.

Various structures have been proposed for echo cancellers [25]-[27]. Figure 5.18 shows a
typical structure of an echo canceller. For the pseudo echo generating part, a transversal type
adaptive filter is often used [28], [29], because it permits real-time operation, guarantees
high stability, and has an established method for estimating the transmission characteristics.
Also, various schemes for estimating the echo path have been proposed (for details, the
reader is referred to Book 2, Chapter 10). Usually, the learning identification method (NLMS),
which has a small computing quantity and allows real-time operation, is adopted.




Figure 5.18 Structure of the echo canceller. 
Key: Pseudo echo
Receiving signal
Residual signal

When echo canceling is to be realized, sequential processing is necessary, with all of the
arithmetic and logic operations finished within each sampling period, which requires an
operation clock speed that is proportional to the echo canceling time. For example,
assuming that $p$ represents the operation clock time (corresponding to the instruction
processing time of the DSP) and $n$ represents the tap number, the product-sum operation
(convolution operation) takes $p \times n$, the operation of the correction quantity (depending on the
algorithm) takes $p \times n \times C_1$, the data transfer takes $p \times n \times C_2$, the double-talk control
takes $p \times C_3$, and the sum of the processing times becomes $p \times [n \times (C_1 + C_2 + 1) + C_3]$.
Here, when the tap number $n$ ($n = T/T_s$, where $T$ is the echo canceling time and $T_s$ is the
sampling period) is increased, the sum is approximately $p \times n \times (1 + C_1 + C_2)$. Also, the
memory generally required is $2 \times n + C_4$. The first term refers to the memory required
for the convolution operation, and the second term is for computing the correction coefficient.
Consequently, when a DSP is used, it is necessary to select a DSP that is appropriate for the echo
canceller based on the operation processing speed and the memory capacity. Also, it is important
to select a type that allows cascade connection of DSPs.
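
The processing time budget can be checked numerically as sketched below. The clock time, the constants C1 to C3, and the sampling period are hypothetical values chosen only for illustration, as are the function names.

```python
def processing_time(p, n, c1, c2, c3):
    """Processing time per sample: p * [n * (C1 + C2 + 1) + C3]."""
    return p * (n * (c1 + c2 + 1) + c3)

def max_taps(p, sample_period, c1, c2, c3):
    """Largest tap number whose processing fits within one sampling period."""
    return int((sample_period / p - c3) // (c1 + c2 + 1))

# Hypothetical DSP: 50 ns per operation, 8 kHz sampling, C1 = C2 = 1, C3 = 20.
P = 50e-9
TS = 1.0 / 8000.0
C1, C2, C3 = 1, 1, 20

t_480 = processing_time(P, 480, C1, C2, C3)
n_max = max_taps(P, TS, C1, C2, C3)
print(f"480 taps need {t_480 * 1e6:.1f} us of the {TS * 1e6:.1f} us sampling period")
print(f"a single DSP of this speed supports at most {n_max} taps")
```

When the required tap number exceeds this limit, the processing has to be divided among several cascaded DSPs, which is why the possibility of cascade connection matters in selecting a device.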

5.3.3 Echo canceller for long distance line 

As a device for suppressing echo, the echo suppressor has long been used. More recently, the
echo canceller, which does not chop the conversation, has been introduced. For the echo canceller,
M. Sondhi of AT&T Bell Laboratories described an adaptive algorithm in theory in 1966 [80].
However, with the analog technology available at that time, it was hard to realize a complicated
adaptive algorithm and the echo canceling quantity demanded of an echo canceller.

Figure 5.19 Generation of echo in telephone conference service (4 pairs connected to each other).
Key: 1 Telephone set 

By the 1970s, with the advent of digital signal processing technology that facilitated
complicated and high precision arithmetic and logic operations and the rapid development of LSI
technology, echo cancellers could be produced with small and low-cost hardware. The first echo
canceller developed for satellite/domestic use was created by Duttweiler of AT&T Bell Laboratories,
and its echo canceling time was 8 ms (64 taps) [81]. Then, AT&T and KDD developed a
single-chip LSI with an echo canceling time of about 60 ms (corresponding to 480 taps). At
present, echo cancellers are rapidly being introduced into satellite/domestic lines.
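
The tap numbers quoted above follow from the echo canceling time once a sampling frequency is fixed. The short sketch below assumes 8 kHz sampling, which the text does not state at this point but which reproduces both figures; the function name is hypothetical.

```python
def taps_for(echo_time_ms, sampling_hz):
    """Tap number n = T / Ts, with T given in milliseconds."""
    return echo_time_ms * sampling_hz // 1000

SAMPLING_HZ = 8000   # assumed telephone-band sampling frequency

for echo_time_ms in (8, 60):
    print(f"{echo_time_ms} ms -> {taps_for(echo_time_ms, SAMPLING_HZ)} taps")
```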

5.3.4 Echo canceller for telephone conference for persons at plural sites 

Echo canceling technology is adopted not only in cases in which echo is simply
cancelled, but also in treating various phenomena that take place due to echo, such as canceling
howling and noise signals.

In the telephone conference service that began in 1986, the system allows simultaneous
connection among 30 pairs of unspecified subscribers [53]. Echo canceling technology is also
adopted in this system. Figure 5.19 shows the state of generation of echo in the case of
connection among four pairs at different sites. In the case of multi-pair connection, plural
voices are added simultaneously, so that plural echoes are also added. Consequently, the total
gain of the system may become greater than one, and, in the worst case, howling (shrieking) or
quasi-howling occurs. In this state, the voice of the conversation counterpart cannot be heard,
and only howling is heard. This phenomenon can be prevented by canceling the echo generated in
the hybrid transformer corresponding to each subscriber. Usually, assuming that the number of
connected pairs is n, the desired echo canceling quantity should be 10 log n (dB) or larger [23].
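
The 10 log n rule can be evaluated directly. The sketch below assumes a base-10 logarithm and a result in decibels, which is the usual reading of such an expression; the function name is hypothetical.

```python
import math

def required_cancellation_db(num_pairs):
    """Lower bound on the echo canceling quantity, 10 * log10(n), in dB."""
    return 10.0 * math.log10(num_pairs)

for n in (4, 10, 30):
    print(f"n = {n:2d}: at least {required_cancellation_db(n):.1f} dB")
```

Under this reading, the four-pair connection of Figure 5.19 calls for about 6 dB and the 30-pair maximum for about 15 dB.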

5.3.5 Echo canceller for relaying received signal

In an analog telephone network, the subscriber information is stored in the local
exchange, and most local exchanges are 2-wire type analog exchanges, so a common channel
signaling system that transfers control signals between exchanges would first have to be introduced.
Consequently, it is impossible to redirect an outgoing call directly to the destination, and the
incoming call must instead be relayed to the destination. As a result, in an analog network, the
transmission loss between the subscribers via the communication network increases, and a
2-wire line type bidirectional amplifier is needed for relaying to the destination at a
normal volume. Figure 5.20 shows the structure of a bidirectional amplifier having an
echo canceller introduced in it. An AGC for automatic compensation of the transmission loss is
necessary. However, the loop gain in the 4-wire line region including the amplifier would then
become greater than one. Consequently, for stable operation, the AGC gain is increased only to
the extent that it is offset by the canceling quantity of the echo canceller. At present, for a
telephone line, the voice power in one direction can be compensated up to 24 dB [34].

5.3.6 Echo canceller for voice answering 

In order to accommodate plural subscribers in a voice answering service, it is preferred
that operation be performed at an out-of-town exchange. Also, because a timer system or
other voice answering service has a fixed answering form, the user usually answers
before the announcement has finished. In this case, a voice signal, a push-button
multi-frequency signal, or another signal is sent as the answer signal. Consequently, on the
receiving side of the trunk device of the out-of-town exchange, as described in Section 5.3.3,
the voice answer announcement signal is mixed with the answer signal by echo and received together with it.
Consequently, the S/N ratio deteriorates, and, in the worst case, the answer signal may be judged
incorrectly. As a result, in order to improve the instant response to the answer and to improve the
correct recognition rate, an echo canceller is used.

5.3.7 Acoustic echo canceller 
(1) TV-conference 

Usually, the reverberation time of a conventional conference room is in the range of 
100-400 ms. When this is represented by a transversal filter with 8 kHz of sampling, it is about 
4000 taps [35]. In addition, for a high quality voice band (7 kHz or higher), because the sampling 
frequency is 2-6 times higher, the tap number is huge. 

In acoustic echo, an indirect wave (so-called reverberation) caused by multiple reflections
from the wall surfaces inside the room is superposed on the direct wave from the speaker to the
microphone, and its impulse response is significantly different from the impulse response
of the line echo. Because people and objects move and the indoor temperature varies over time, the
characteristics also vary significantly [38]. Also, many noises, such as the noise of an air
conditioner and voices of surrounding persons, cause deterioration in the performance of the
echo canceller. In addition, realization of the hardware is much more difficult than for other
types of echo cancellers.

Figure 5.20 Structure of 2-wire bidirectional relay amplifier.

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Consequently, it is possible to guarantee real-time operation by adopting the following 
methods: the tandem connection method [37], in which the necessary operation quantity and the 
storage capacity for each chip can be reduced by connecting plural chips in tandem and 
performing parallel treatment, and the band dividing method [38], in which the high band side is 
folded back to the low frequency side by means of modulation, so that the sampling frequency is 
lowered. Also, the convergence time that obtains the desired echo canceling quantity in the long 
tap length is longer in the learning identification method. Consequently, reduction in the 
convergence time by means of the following methods has been investigated: non-correlation 
method [39] that converts the input voice signal to a white signal, or plural band dividing method 
[40], and adaptive filter coefficient variation method [41] using the index attenuation 
characteristics of the impulse response. 
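
The idea behind the tandem connection can be illustrated by splitting the convolution sum of one
long transversal filter into consecutive tap segments, one per chip, each working on a suitably
delayed input; the partial outputs then add up to the full output. The following is a minimal
sketch of that arithmetic only, not the chip architecture of [37]. The function names and the
equal-sized segmentation are assumptions made for illustration.

```python
def fir(x, h):
    """Direct transversal filter: y[n] = sum over k of h[k] * x[n - k]."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n >= k)
            for n in range(len(x))]

def tandem_fir(x, h, num_chips):
    """Same output as fir(x, h), computed as a sum of per-chip partial filters."""
    size = -(-len(h) // num_chips)          # taps per chip, rounded up
    y = [0.0] * len(x)
    for start in range(0, len(h), size):
        segment = h[start:start + size]     # taps held by this chip
        # This chip sees the input delayed by the taps handled by earlier chips.
        delayed = ([0.0] * start + list(x))[:len(x)]
        partial = fir(delayed, segment)
        y = [a + b for a, b in zip(y, partial)]
    return y

x = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0, 0.0, 1.5]
h = [0.5, -0.25, 0.125, 0.0625, -0.03125, 0.015625]
full = fir(x, h)
split = tandem_fir(x, h, num_chips=3)
print(max(abs(a - b) for a, b in zip(full, split)))
```

Each segment needs only its own share of the coefficients and of the multiply-accumulate
operations per sample, which is the sense in which the load on each chip is reduced.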
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Figure 5.21 Structure of voice-amplified telephone using echo canceller. 
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(2) Voice-amplified telephone (automobile telephone) 

A voice-amplified telephone, with which calling is possible without a handset, generates
both an acoustic echo, caused by the acoustic coupling between the loudspeaker and the
microphone, and a line echo, caused by the hybrid transformer. In consideration of operability,
the loudspeaker and the microphone are usually integrated in one unit [42]. Because a
voice-amplified telephone must be low in cost, the tap number of the transversal filter cannot
be large. Consequently, the echo canceller is used along with an echo compressor, specifically
to prevent howling. Figure 5.21 shows an example of the structure of a voice-amplified telephone
using an echo canceller.
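
As an illustration of an echo canceller with a short transversal filter, the following is a
minimal Python sketch of the learning identification method (normalized LMS). It is an
assumption-laden example, not the structure of Figure 5.21: the function name, the step size mu,
the small constant eps, and the 3-tap echo path in the demo are all hypothetical, and the echo
compressor is not modelled.

```python
import random

def nlms_echo_canceller(far_end, mic, num_taps, mu=0.5, eps=1e-6):
    """Cancel the echo of far_end contained in mic with a normalized LMS filter.

    Returns the final coefficients and the residual (echo-cancelled) signal.
    """
    w = [0.0] * num_taps          # adaptive filter coefficients
    buf = [0.0] * num_taps        # latest far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        replica = sum(wi * bi for wi, bi in zip(w, buf))   # estimated echo
        e = d - replica                                     # residual echo
        power = sum(b * b for b in buf) + eps               # normalization term
        w = [wi + mu * e * bi / power for wi, bi in zip(w, buf)]
        residual.append(e)
    return w, residual

# Demo with a made-up 3-tap echo path and white noise as the far-end signal.
random.seed(0)
far_end = [random.uniform(-1.0, 1.0) for _ in range(500)]
echo_path = [0.6, -0.3, 0.1]
mic = [sum(echo_path[k] * far_end[n - k] for k in range(len(echo_path)) if n >= k)
       for n in range(len(far_end))]
w, residual = nlms_echo_canceller(far_end, mic, num_taps=4)
print([round(c, 3) for c in w])
```

The demo uses a noise-free microphone signal and a filter at least as long as the echo path. In
a real voice-amplified telephone the echo path is longer than an inexpensive filter can cover,
which is one reason the canceller is combined with other measures.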

The voice-amplified telephone is also adopted as a telephone for automobiles in order to
ensure safe driving [43]. Because the space inside the automobile cabin is small and the sound
absorption of the seats is high, the reverberation time is shorter than in an ordinary indoor
environment, and the conversation bandwidth is 0.3-3.4 kHz. The hardware is therefore smaller
than that used in TV conferencing. However, the noise level is high while the automobile is
running, and, because the space is small, movement of persons inside the automobile changes the
transmission characteristics of the echo path. Consequently, the actual application environment
is much more severe.

(M. Shimada) 
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5.4 Wireless communication 

5.4.1 Application of digital signal processing in wireless communication 

Typical wireless communication systems include mobile communication, fixed wireless
communication, and satellite communication [44]-[47]. Their parameters and application
conditions differ significantly. Technical development has proceeded from the viewpoint that
the limited radio frequency spectrum should be used efficiently. In order to improve the
performance and to reduce the size of the equipment, it is necessary to introduce digital signal
processing (see Table 5.4). However, in applications in wireless communication, ① spurious
emission is strictly regulated by the Electromagnetic Wave Law; ② the dynamic range of the
received wave level is wide; ③ the carrier frequency of the transmitted modulated wave is
high; ... [end of available text]



Table 5.4 Digital signal processing in wireless communication.

Key:
1 Purpose
2 Application examples
3 Higher performance
4 Smaller size
5 Lower price
6 Confidentiality of communication
7 High-precision waveform processing
   • Multilevel modem scheme: high-precision processing by A-D and D-A conversion
   • Roll-off wave shaping: high-precision waveform generation by FIR filter
   • Burst transmission waveform shaping: high-precision waveform generation by burst window
8 Control under fast fading
   • Phase control in carrier recovery: prediction of the carrier phase by an AR model and
     removal of noise influence
   • Equalization and diversity coefficient control: optimum control by RLS
9 High-reliability transmission processing
   • Correction of code errors and demodulation of coded modulated waves: branch metric
     computation of MLSE and ACS
   • Fading countermeasure for the voice codec: protection of prediction parameters
10 Fast response
   • Synthesizer for frequency hopping: high-precision, high-speed waveform generation by D-A
     conversion
11 Digital LSI implementation of components
12 [illegible] adjustment
   • Automatic calibration processing: AFC, distortion compensation
13 High-level scrambling
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